A Machine Learning Approach for Requirements Traceability in Model-Driven Development
Summary
[Context & Motivation] Requirements Traceability (RT) aims to follow and describe the lifecycle of a requirement. A multitude of standards require RT practices because they provide benefits in project management, project visibility, maintenance, and verification and validation.
[Problem] Many of these RT practices are carried out manually, which poses significant risks: manual tracing is error-prone, vulnerable to changes, time-consuming, and difficult to maintain. Trace recovery should therefore be automated. However, existing automatic tracing tools have shortcomings, as evidenced by their low penetration in practice.
[Results] We propose to tackle this problem by using machine learning (ML) techniques. This research presents the design of a tracing tool for automatically recovering traces between JIRA issues and commits in a model-driven development (MDD) context. Using process and text-based data, we created 154 features to train an ML classifier. This classifier was then validated using four real MDD industry datasets. With the best tested configuration, we obtained an average F2-score of 73.48 for the scenario in which traces are recommended to a developer, and an F0.5-score of 77.32 for the scenario in which the traces of a current project are maintained automatically.
[Contribution] The findings of this study demonstrate that state-of-the-art trace recovery techniques can be implemented successfully in an MDD context, bridging the gap between academia and industry.
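
The two scores reported in the results differ only in how they weight precision against recall. The sketch below uses the standard F-beta definition to illustrate this: F2 weights recall more heavily, which suits recommending candidate traces to a developer, while F0.5 weights precision more heavily, which suits maintaining traces without human review. The function name `f_beta` and the precision and recall values are hypothetical and chosen for illustration only; they are not taken from the study.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily; beta < 1 weights precision more heavily.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so the score is defined as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical classifier output (illustrative values, not from the study):
# high recall, moderate precision.
precision, recall = 0.5, 1.0

f2 = f_beta(precision, recall, beta=2.0)     # recall-oriented score
f05 = f_beta(precision, recall, beta=0.5)    # precision-oriented score

print(f"F2 = {f2:.4f}, F0.5 = {f05:.4f}")
```

For the same classifier output, the recall-oriented F2-score is higher than the precision-oriented F0.5-score, because the hypothetical classifier finds every true trace but also proposes false ones.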