A comparison of software effort prediction models using small datasets
Submitted to IEEE Transactions on Software Engineering. If published, this version will be replaced by the final version.

Constructing an accurate effort prediction model is a challenge in software engineering. One difficulty practitioners often experience is that they have only a very small amount of local data from which to construct a model. A small dataset limits the predictive accuracy of the model, since accuracy deteriorates as the size of the dataset decreases. This paper compares three software development effort prediction models that are applicable to such small datasets: (1) Bayesian statistical models, (2) multiple linear regression models, and (3) case-based reasoning (analogy-based) models. The predictive accuracy of these models is evaluated using two different software datasets. The results show that the accuracy of the Bayesian statistical models is higher than, or competitive with, that of the others when the models are calibrated using data collected from fewer than 10 systems. This suggests that a Bayesian statistical model is the better choice for effort prediction when practitioners have only a very small dataset, consisting of fewer than 10 systems similar to their system of interest.
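The kind of comparison described in the abstract can be sketched on a toy example. The snippet below contrasts two of the three approaches, ordinary least-squares regression and analogy-based prediction in its simplest (nearest-neighbour) form, under leave-one-out cross-validation with the MMRE accuracy statistic. The dataset, variable names, and resulting numbers are hypothetical illustrations and are not taken from the paper.

```python
# Illustrative leave-one-out comparison of two simple effort predictors
# on a tiny hypothetical dataset (not the datasets used in the paper).

# (size in KLOC, effort in person-months) for 8 hypothetical systems
data = [(2, 5), (4, 9), (5, 12), (7, 15), (9, 21), (10, 22), (12, 28), (15, 33)]

def fit_linear(points):
    """Ordinary least squares for a single predictor: effort = a + b * size."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

def predict_regression(train, size):
    """Multiple linear regression reduced to one predictor for illustration."""
    a, b = fit_linear(train)
    return a + b * size

def predict_analogy(train, size):
    """Case-based reasoning in its simplest form: reuse the effort of the
    most similar (closest-size) previously completed system."""
    return min(train, key=lambda p: abs(p[0] - size))[1]

def loocv_mmre(predict):
    """Mean Magnitude of Relative Error under leave-one-out cross-validation:
    each system is predicted from a model built on the remaining systems."""
    errors = []
    for i, (size, actual) in enumerate(data):
        train = data[:i] + data[i + 1:]
        estimate = predict(train, size)
        errors.append(abs(actual - estimate) / actual)
    return sum(errors) / len(errors)

print(f"regression MMRE: {loocv_mmre(predict_regression):.3f}")
print(f"analogy    MMRE: {loocv_mmre(predict_analogy):.3f}")
```

Lower MMRE means better accuracy; note that references [5] and [16] in the paper's bibliography caution that MMRE can be a misleading evaluation criterion, which is why leave-one-out estimates such as these should be interpreted carefully on very small datasets.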
Title: A comparison of software effort prediction models using small datasets
Author / contributor: van Koten, Chikako
Source: 688; (2011)
Published: University of Otago, 2011
Media type: unknown