Bayesian statistical effort prediction models for data-centred 4GL software development
In: Information Science Discussion Papers Series; 24; (2011)
Online
report
Constructing an accurate effort prediction model is a challenge in software engineering. This paper presents three Bayesian statistical software effort prediction models for database-oriented software systems that are developed using a specific 4GL tool suite. The models combine specification-based software size metrics with a metric of the development team's productivity. They are constructed from the subjective knowledge of a human expert and calibrated using empirical data collected from 17 software systems developed in the target environment. The models' predictive accuracy is evaluated using subsets of the same data that were not used for calibration. The results show that the models achieve very good predictive accuracy in terms of the MMRE and pred measures, which confirms that Bayesian statistical models can predict effort successfully in the target environment. Compared with commonly used multiple linear regression models, the Bayesian statistical models' predictive accuracy is equivalent in general. However, when fewer than five software systems are used for calibration, the predictive accuracy of the best Bayesian statistical models is significantly better than that of the multiple linear regression model. This result suggests that Bayesian statistical models would be the better choice when software organizations or practitioners do not possess sufficient empirical data for calibration. The authors expect these findings to encourage more researchers to investigate the use of Bayesian statistical models for predicting software effort.

Unpublished
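The abstract reports accuracy in terms of MMRE and pred without defining them. As a minimal sketch, assuming the definitions commonly used in the effort estimation literature (MMRE as the mean of |actual - predicted| / actual, and pred(l) as the fraction of systems whose relative error is at most l), the two measures could be computed as follows. The function names, the 0.25 threshold, and the effort figures are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def relative_errors(actual: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Magnitude of relative error (MRE) for each system: |actual - predicted| / actual."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one system is required")
    if any(a <= 0 for a in actual):
        raise ValueError("actual effort values must be positive")
    return [abs(a - p) / a for a, p in zip(actual, predicted)]


def mmre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean magnitude of relative error over all systems (lower is better)."""
    errors = relative_errors(actual, predicted)
    return sum(errors) / len(errors)


def pred(actual: Sequence[float], predicted: Sequence[float], level: float = 0.25) -> float:
    """Fraction of systems whose MRE is at most `level` (higher is better)."""
    errors = relative_errors(actual, predicted)
    return sum(1 for e in errors if e <= level) / len(errors)


if __name__ == "__main__":
    # Hypothetical effort values (e.g. person-hours); not data from the paper.
    actual_effort = [100.0, 200.0, 400.0, 80.0]
    predicted_effort = [110.0, 160.0, 420.0, 120.0]
    print("MMRE:", mmre(actual_effort, predicted_effort))
    print("pred(0.25):", pred(actual_effort, predicted_effort, level=0.25))
```

A low MMRE together with a high pred(0.25) is the usual pattern read as good predictive accuracy; the thresholds applied in the paper itself are not stated in this record.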
Title: | Bayesian statistical effort prediction models for data-centred 4GL software development |
---|---|
Author / Contributor: | van Koten, Chikako ; Gray, Andrew |
Source: | Information Science Discussion Papers Series; 24; (2011) |
Published: | University of Otago, 2011 |
Media type: | report |