Research Reports
Communications at congresses:
On the distribution of source code file sizes
Year:2011

Research Areas
  • Information technology and data processing

Information
Abstract
Source code size is an estimator of software effort. Size is also often used to calibrate models and equations to estimate the cost of software. The distribution of source code file sizes has been shown in the literature to be a lognormal distribution. In this paper, we measure the size of a large collection of software (the Debian GNU/Linux distribution version 5.0.2), and we find that the statistical distribution of its source code file sizes follows a double Pareto distribution. This means that large files are found more often than the lognormal distribution predicts, so the previously proposed models underestimate the cost of software.
International
Yes
Congress
ICSOFT 2011 - International Conference on Software and Data Technologies
960
Place
Seville, Spain
Reviewers
Yes
ISBN/ISSN
978-989-8425-77-5
Start Date
18/07/2011
End Date
21/07/2011
From page
5
To page
14
Proceedings of the 6th International Conference on Software and Data Technologies, Volume 2
Participants
  • Author: Israel Herraiz Tabernero (UPM)

Related Research Groups, Departments and Institutes
  • Creator: Research Group: Matemática e Informática Aplicadas a la Ingeniería Civil