Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Dalpiaz, F.
dc.contributor.advisor: Lucassen, G.G.
dc.contributor.author: Arendse, B.
dc.date.accessioned: 2016-08-22T17:01:02Z
dc.date.available: 2016-08-22T17:01:02Z
dc.date.issued: 2016
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/23654
dc.description.abstract: One of the main reasons software projects fail is the poor quality of requirements, or the lack of any documented requirements. In the last two decades there has been increasing interest in developing tools that use Natural Language Processing (NLP) to support the formulation, documentation, and verification of natural language (NL) requirements, and numerous NLP tools have been built to improve requirements quality. With so many approaches available, it is not clear how they relate to each other. The goal of this thesis is therefore to obtain a clear overview of the performance of the main approaches taken by NLP tools in the requirements engineering (RE) landscape, and to design a theoretical tool that synergistically integrates the best approaches. The scope is limited to finding defects and deviations in natural language requirements. A literature study is performed to identify the 50 main NLP tools in the RE landscape. After an initial analysis, three tools are selected for further study. From the features of these three tools, a requirement standard is created that specifies what constitutes a quality defect for each feature. Using this standard, four datasets are tagged for quality defects. The tagged datasets are compared against the output of the tools, using the metrics precision and recall to measure the performance of each tool's features. Based on the performance of the features and a qualitative analysis of their approaches, a set of good and bad practices is derived:
1. Different tokenizers: the choice of which tokenizer to use can affect the performance of a tool, both positively and negatively.
2. Dictionary vs. parsing: using a dictionary is a safe and simple method to detect defects. Parsing is a more complicated approach, and when not performed correctly it can negatively affect the performance of a tool.
3. What is in the dictionary: the size and content of a dictionary can affect both the recall and the precision of a tool; the bigger the dictionary, the better.
The performance of the features and the set of good and bad practices lead to the design of a next-generation tool, which incorporates the best-performing approach (with respect to recall) for each feature specified in the requirement standard. NLP tool developers can use the set of good and bad practices and the design of the next-generation tool when developing their own NLP tools.
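The evaluation the abstract describes, comparing a tool's flagged defects against hand-tagged gold datasets with precision and recall, can be illustrated with a minimal sketch. The tiny vague-term dictionary, the requirement texts, and all function names below are hypothetical and are not drawn from any of the compared tools; this only shows the shape of a dictionary-based detector and its scoring, not the thesis's actual implementation.

```python
# Toy dictionary of vague terms; a real tool's dictionary would be far larger
# (practice 3 in the abstract: size and content affect precision and recall).
VAGUE_TERMS = {"fast", "user-friendly", "appropriate", "etc"}

def detect_defects(requirements):
    """Flag the IDs of requirements whose text contains a vague term."""
    flagged = set()
    for req_id, text in requirements.items():
        # Naive whitespace tokenizer; per practice 1, the tokenizer
        # choice itself can change a tool's results.
        tokens = text.lower().replace(",", " ").replace(".", " ").split()
        if any(tok in VAGUE_TERMS for tok in tokens):
            flagged.add(req_id)
    return flagged

def precision_recall(detected, gold):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = len(detected & gold)
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical requirements and a hand-tagged gold set of defective IDs.
requirements = {
    "R1": "The system shall respond fast.",           # vague: "fast"
    "R2": "The system shall log every transaction.",  # no defect
    "R3": "The UI shall be user-friendly.",           # vague: "user-friendly"
}
gold = {"R1", "R3"}

detected = detect_defects(requirements)
p, r = precision_recall(detected, gold)
```

On this toy data the detector finds exactly the gold defects, so both metrics are 1.0; in practice, growing or shrinking the dictionary trades false positives against false negatives, which is what the thesis measures per feature.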
dc.description.sponsorship: Utrecht University
dc.format.extent: 2123893
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: A thorough comparison of NLP tools for requirements quality improvement
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Natural Language Processing, Requirements Engineering, Quality of Requirements, Mashup
dc.subject.courseuu: Business Informatics

