Analysing Quality of Textual Requirements Using Natural Language Processing: A Literature Review

Author(s):  
Jerzy Kocerka ◽  
Michal Krzeslak ◽  
Adam Galuszka

Vector representations of language have been shown to be useful in a number of Natural Language Processing tasks. In this paper, we investigate the effectiveness of word vector representations for the problem of sentiment analysis. In particular, we target three sub-tasks: sentiment word extraction, polarity detection of sentiment words, and text sentiment prediction. We investigate the effectiveness of vector representations over different text data and evaluate the quality of domain-dependent vectors. Vector representations have been used to compute various vector-based features, and we conduct systematic experiments to demonstrate their effectiveness. Simple vector-based features achieve better results for app text sentiment analysis.
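As an illustration of the vector-based features mentioned above, a common minimal approach is to average the word vectors of a text. This sketch uses toy two-dimensional vectors in place of trained embeddings; the vocabulary and dimensions are invented for the example, not taken from the paper:

```python
import numpy as np

# Toy word vectors standing in for trained embeddings (e.g. word2vec);
# real vectors would have hundreds of dimensions.
word_vectors = {
    "great":    np.array([0.9, 0.1]),
    "love":     np.array([0.8, 0.2]),
    "terrible": np.array([0.1, 0.9]),
    "crash":    np.array([0.2, 0.8]),
}

def text_feature(tokens):
    """Average the vectors of known words -- one simple vector-based feature."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(2)

# A review whose averaged vector leans toward the first ("positive") axis
feat = text_feature(["love", "this", "great", "app"])
```

The averaged vector can then be fed to any standard classifier for the text sentiment prediction sub-task.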


2020 ◽  
Vol 8 ◽  
Author(s):  
Majed Al-Jefri ◽  
Roger Evans ◽  
Joon Lee ◽  
Pietro Ghezzi

Objective: Many online and printed media publish health news of questionable trustworthiness, and it may be difficult for laypersons to determine the information quality of such articles. The purpose of this work was to propose a methodology for the automatic assessment of the quality of health-related news stories using natural language processing and machine learning. Materials and Methods: We used a database from the website HealthNewsReview.org, which aims to improve the public dialogue about health care. HealthNewsReview.org developed a set of criteria to critically analyze claims about health care interventions. In this work, we attempt to automate the evaluation process by identifying the indicators of those criteria using natural language processing-based machine learning on a corpus of more than 1,300 news stories. We explored features ranging from simple n-grams to more advanced linguistic features and optimized the feature selection for each task. Additionally, we experimented with the pre-trained language model BERT. Results: For some criteria, such as mention of costs, benefits, harms, and “disease-mongering,” the evaluation results were promising, with an F1 measure reaching 81.94%, while for others the results were less satisfactory due to the dataset size, the need for external knowledge, or the subjectivity of the evaluation process. Conclusion: The criteria used here are more challenging than those addressed by previous work, and our aim was to investigate how much more difficult the machine learning task was, and how and why it varied between criteria. For some criteria, the obtained results were promising; however, automated evaluation of the other criteria may not yet replace the manual process, in which human experts interpret text senses and make use of external knowledge in their assessment.
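The simplest feature family mentioned above, word n-grams, can be sketched in a few lines. This is a generic illustration of n-gram feature extraction, not the authors' actual pipeline; the example story text is invented:

```python
from collections import Counter

def ngram_features(text, n=2):
    """Bag of word n-grams -- the simplest feature family for text classifiers."""
    tokens = text.lower().split()
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(grams)

# An invented snippet; a "mention of costs" classifier might key on such bigrams.
story = "The new drug costs 50000 dollars per year"
feats = ngram_features(story, n=2)
```

Feature counts like these are typically vectorized and passed to a classifier, with feature selection tuned per criterion as the abstract describes.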


2021 ◽  
Author(s):  
Sena Chae ◽  
Jiyoun Song ◽  
Marietta Ojo ◽  
Maxim Topaz

The goal of this natural language processing (NLP) study was to identify patients in home healthcare with heart failure symptoms and poor self-management (SM). Preliminary lists of symptoms and poor SM status were identified, NLP algorithms were used to refine the lists, and NLP performance was evaluated on 2.3 million home healthcare clinical notes. The overall precision in identifying patients with heart failure symptoms and poor SM status was 0.86. The methods were shown to be feasible for identifying patients with heart failure symptoms and poor SM documented in home healthcare notes. This study facilitates utilizing key symptom information and patients’ SM status from unstructured data in electronic health records. The results can be applied to better individualize symptom management to support heart failure patients’ quality of life.
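The abstract does not detail the algorithms, but lexicon-based matching against refined term lists is a common core of such clinical NLP pipelines. The sketch below is hypothetical: the term lists and the example note are invented, and a real system would also handle negation and spelling variants:

```python
# Hypothetical mini-lexicons; the study refined its lists iteratively with NLP.
SYMPTOM_TERMS = ["shortness of breath", "edema", "fatigue", "orthopnea"]
POOR_SM_TERMS = ["missed medication", "noncompliant", "high sodium diet"]

def flag_note(note):
    """Return which lexicon terms appear in a clinical note."""
    text = note.lower()
    return {
        "symptoms": [t for t in SYMPTOM_TERMS if t in text],
        "poor_self_management": [t for t in POOR_SM_TERMS if t in text],
    }

note = "Pt reports fatigue and edema; missed medication twice this week."
result = flag_note(note)
```

Precision of such a system is then measured by having clinicians review a sample of flagged notes, as the 0.86 figure above suggests.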


Author(s):  
Rahul Sharan Renu ◽  
Gregory Mocko

The objective of this research is to investigate the requirements and performance of parts-of-speech tagging of assembly work instructions. Natural Language Processing of assembly work instructions is required to perform data mining with the objective of knowledge reuse. Assembly work instructions are key process engineering elements that allow for predictable assembly quality of products and predictable assembly lead times. Authoring of assembly work instructions is a subjective process. It has been observed that most assembly work instructions are not grammatically complete sentences. It is hypothesized that this can lead to false parts-of-speech tagging (by Natural Language Processing tools). To test this hypothesis, two parts-of-speech taggers are used to tag 500 assembly work instructions (obtained from the automotive industry). The first parts-of-speech tagger is obtained from the Natural Language Toolkit (nltk.org) and the second from the Stanford Natural Language Processing Group (nlp.stanford.edu). For each of these taggers, two experiments are conducted. In the first experiment, the assembly work instructions are input to each tagger in raw form. In the second experiment, the assembly work instructions are preprocessed to make them grammatically complete, and then input to the tagger. It is found that the Stanford Natural Language Processing tagger with the preprocessed assembly work instructions produced the fewest false parts-of-speech tags.
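The abstract does not specify how instructions were made grammatically complete; one plausible normalization is prepending an explicit subject so the imperative verb sits in a full sentence. The sketch below is a hypothetical illustration of that preprocessing step, with an invented example instruction:

```python
def make_complete(instruction):
    """Turn a terse imperative work instruction into a full sentence.

    Hypothetical normalization: prepend an explicit subject and add
    terminal punctuation, so taggers see a grammatical sentence and
    are less likely to mis-tag the leading verb as a noun.
    """
    text = instruction.strip().rstrip(".")
    return f"The operator must {text[0].lower()}{text[1:]}."

raw = "Tighten bolt to 25 Nm"
complete = make_complete(raw)
```

The completed sentence, rather than the raw fragment, would then be passed to the NLTK or Stanford tagger in the second experiment.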


2021 ◽  
Author(s):  
Anahita Davoudi ◽  
Hegler Tissot ◽  
Abigail Doucette ◽  
Peter E Gabriel ◽  
Ravi B. Parikh ◽  
...  

One core measure of healthcare quality set forth by the Institute of Medicine is whether care decisions match patient goals. High-quality "serious illness communication" about patient goals and prognosis is required to support patient-centered decision-making; however, current methods are not sensitive enough to measure the quality of this communication or determine whether the care delivered matches patient priorities. Natural language processing offers an efficient method for the identification and evaluation of documented serious illness communication, which could serve as the basis for future quality metrics in oncology and other forms of serious illness. In this study, we trained NLP algorithms to identify and characterize serious illness communication with oncology patients.


Online business has opened up several avenues for researchers and computer scientists to initiate new research models. The business activities that customers carry out produce abundant data, and analysis of this data yields useful inferences. These inferences may help a system improve its quality of service and understand current market requirements, business trends, future needs of society, and so on. In this connection, the current paper proposes a feature extraction technique named the Business Sentiment Quotient (BSQ). BSQ involves the word2vec [1] word embedding technique from Natural Language Processing. A number of business-related tweets are retrieved from Twitter and processed to estimate BSQ using the Python programming language. BSQ may be utilized for further machine learning activities.
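The abstract does not define the BSQ formula, so the sketch below is only a hypothetical reading: word2vec-style vectors for tweet words are compared against a reference "positive business" direction, and the quotient is the mean cosine similarity. The toy vectors, the reference direction, and the quotient definition are all invented for illustration; a real pipeline would use vectors from a trained gensim word2vec model:

```python
import math

# Toy embeddings standing in for a trained word2vec model.
vectors = {
    "profit": [0.9, 0.3], "growth": [0.8, 0.4],
    "loss":   [0.2, 0.9], "decline": [0.1, 0.8],
}
positive_ref = [1.0, 0.0]   # hypothetical "positive business" direction

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def bsq(tweet_tokens):
    """Hypothetical quotient: mean similarity of tweet words to the reference."""
    sims = [cosine(vectors[t], positive_ref) for t in tweet_tokens if t in vectors]
    return sum(sims) / len(sims) if sims else 0.0

score_up = bsq(["profit", "growth"])
score_down = bsq(["loss", "decline"])
```

Under this reading, tweets about profit and growth score higher than tweets about losses, giving a per-tweet feature usable in downstream machine learning.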


Author(s):  
Youssef Damak ◽  
Marija Jankovic ◽  
Yann Leroy ◽  
Karim Chelbi

Abstract: The R&D of Autonomous Transportation Systems (ATS) is hindered by the lack of industrial feedback and of clients' knowledge about technological possibilities. In addition, because of intellectual property (IP) issues, technology consulting companies cannot directly reuse functionalities developed with different clients. In this context, requirements reuse techniques present a good way to capitalize on their knowledge while avoiding IP issues. However, the literature on requirements reuse processes does not propose methods for applying reuse processes with little information about the system's operational context. In this paper, we present a semi-automated requirement reuse and recycle process for ATS R&D. The process helps designers cope with the lack of inputs from clients. Candidate requirements are retrieved from a database using Natural Language Processing and traceability propagation. The process is applied to 3 use cases with fewer than 5 concepts from the client's needs as input. The results validate its efficiency through the number of requirements retrieved and the analysis time consumed.
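The retrieval step described above can be illustrated with a simple similarity measure between the client's few concepts and stored requirements. This is a hypothetical sketch using Jaccard token overlap, not the authors' actual NLP method, and the mini requirements database is invented:

```python
def jaccard(a, b):
    """Token-set similarity between client concepts and a stored requirement."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

# A small invented stand-in for the consultancy's requirements database.
requirements = [
    "shuttle shall detect obstacle within 30 m",
    "vehicle shall localize position on map",
    "shuttle shall stop when obstacle detected",
]

def retrieve(concepts, top_k=2):
    """Rank stored requirements by overlap with the few client concepts."""
    scored = [(jaccard(concepts, r.split()), r) for r in requirements]
    return [r for s, r in sorted(scored, reverse=True)[:top_k] if s > 0]

# Fewer than 5 concepts, as in the reported use cases.
candidates = retrieve(["obstacle", "detect", "shuttle"])
```

Retrieved candidates would then be expanded via traceability links to pull in related requirements, per the process described in the abstract.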


2018 ◽  
Vol 25 (4) ◽  
pp. 435-458
Author(s):  
Nadezhda S. Lagutina ◽  
Ksenia V. Lagutina ◽  
Aleksey S. Adrianov ◽  
Ilya V. Paramonov

The paper reviews the existing Russian-language thesauri in digital form and methods for their automatic construction and application. The authors analyzed the main characteristics of open-access thesauri for scientific research, evaluated trends in their development, and assessed their effectiveness in solving natural language processing tasks. The statistical and linguistic methods of thesaurus construction that make it possible to automate development and reduce the labor costs of expert linguists were studied. In particular, the authors considered algorithms for extracting keywords and semantic thesaurus relationships of all types, as well as the quality of thesauri generated with these tools. To illustrate the features of various methods for constructing thesaurus relationships, the authors developed a combined method that generates a specialized thesaurus fully automatically, taking into account a text corpus in a particular domain and several existing linguistic resources. With the proposed method, experiments were conducted on two Russian-language text corpora from two subject areas: articles about migrants, and tweets. The resulting thesauri were assessed using an integrated assessment developed in the authors’ previous study that allows analysis of various aspects of a thesaurus and of the quality of the generation methods. The analysis revealed the main advantages and disadvantages of various approaches to the construction of thesauri and the extraction of semantic relationships of different types, and made it possible to determine directions for future study.
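Of the statistical methods surveyed above, keyword extraction is the simplest building block: frequency counting over a domain corpus, minus stopwords. The sketch below is a generic illustration (the stopword list and mini-corpus are invented, and English stands in for Russian), not the authors' combined method:

```python
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "in", "to", "is", "for"}

def extract_keywords(corpus, top_k=3):
    """Frequency-based keyword candidates -- the simplest statistical method."""
    counts = Counter(
        w for doc in corpus for w in doc.lower().split() if w not in STOPWORDS
    )
    return [w for w, _ in counts.most_common(top_k)]

# Tiny invented stand-in for a domain corpus (e.g. articles about migrants).
corpus = [
    "migrants cross the border in search of work",
    "the border police register migrants",
    "work permits for migrants",
]
keywords = extract_keywords(corpus)
```

Extracted keywords become thesaurus term candidates, between which semantic relationships (synonymy, hyponymy, association) are then established by the linguistic methods the review covers.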


Author(s):  
Suchetha Vijayakumar ◽  
Nethravathi P. S.

Purpose: Research involves the creation and implementation of new ideas built on a foundation of existing work. The literature review done in this paper is intended to familiarise the reader with the domain of research and to integrate existing ideas with new ones. Methodology: The literature required for this study was chosen from multiple secondary sources such as journals, conference proceedings, and web resources. All the pieces of literature were carefully studied and summarised, and the summary was then used to arrive at research agendas and research gaps. Findings/Result: It has been observed that Natural Language Processing (NLP) is a field involving the analysis and processing of textual content, and that it requires machine learning algorithms to support the processing. This combination has already been used in various domains, an important one being the health sector: EMR data is huge, and NLP can successfully process and prioritize it in different dimensions. In the same direction, the same concepts and technology can be applied to Software Engineering, where requirements can be prioritized. Originality: This literature review is carried out using secondary data collected through various online sources. The information gathered will be used in the future to build upon existing theory and frameworks or to build a new methodology. Care is taken that no conclusion or decision is biased or unidirectional, and a sincere effort is made to identify a research topic on which to carry out the research. Paper Type: Literature Review.

