Leveraging Open-Source Data for Software Cost Estimation: A Predictive Modeling Approach
Summary
Software cost estimation is a long-standing research area in software engineering. The many factors that affect cost, together with the dynamic nature of software development, make accurate estimation persistently difficult. Over the years, several strategies and models have been formulated to tackle this problem, with varying degrees of success and usability.
This study introduces a software cost estimation tool that automates data extraction from an online software project repository. The tool, designed on the basis of extensive research and expert insights, collects, aggregates, and stores project data in a dataset, creating a comprehensive knowledge base.
Automatic extraction and aggregation of project data replace the manual, time-consuming collection processes that are currently prevalent, improving both efficiency and precision. The resulting dataset gives researchers an easily accessible, ready-to-use repository with which to experiment and identify the critical factors affecting software costs, without the burden of collecting the data themselves.
Furthermore, our data repository supports software effort estimation using various machine-learning techniques. Within the scope of this study, we implemented and evaluated four such methods, offering researchers a starting point for comparative analysis and refinement of existing models.
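To illustrate what such a comparative evaluation might look like, the following is a minimal sketch, not the study's implementation. It assumes a hypothetical toy dataset of (size, effort) pairs and compares two stand-in estimators, a mean baseline and a single-feature least-squares regression, by leave-one-out mean absolute error. The dataset, the feature, the estimators, and the error metric are all assumptions made for illustration; the four methods evaluated in the study are not reproduced here.

```python
from statistics import mean

# Hypothetical toy dataset: (size_kloc, effort_person_months).
# These values are illustrative only and do not come from the study.
PROJECTS = [(10, 24), (20, 50), (30, 72), (40, 101), (50, 123)]


def fit_mean(train):
    """Baseline: always predict the mean effort of the training projects."""
    avg = mean(effort for _, effort in train)
    return lambda size: avg


def fit_linear(train):
    """Ordinary least squares on a single feature (project size)."""
    xs = [size for size, _ in train]
    ys = [effort for _, effort in train]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes at least two distinct sizes in the training set
    intercept = y_bar - slope * x_bar
    return lambda size: intercept + slope * size


def loo_mae(fit, data):
    """Leave-one-out mean absolute error of an estimator on the dataset."""
    errors = []
    for i, (size, effort) in enumerate(data):
        train = data[:i] + data[i + 1:]  # hold out project i
        model = fit(train)
        errors.append(abs(model(size) - effort))
    return mean(errors)


def compare(data):
    """Return the leave-one-out error for each candidate estimator."""
    candidates = {"mean": fit_mean, "linear": fit_linear}
    return {name: loo_mae(fit, data) for name, fit in candidates.items()}


if __name__ == "__main__":
    for name, error in compare(PROJECTS).items():
        print(f"{name}: leave-one-out MAE = {error:.2f}")
```

Holding out each project in turn keeps the comparison honest on a small dataset, since no estimator is scored on a project it was fitted to. Other estimators can be added to the `candidates` dictionary as long as they follow the same fit-then-predict shape.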
The implementation and application of the developed tool demonstrate its potential to improve the field. It contributes a new perspective and methodology for software cost estimation and lays the groundwork for future researchers to explore this domain with greater ease and precision.