Natural language processing and recurrent network models for identifying genomic mutation-associated cancer treatment change from patient progress notes

Meijian Guan; Samuel Cho; Robin Petro; Wei Zhang; Boris Pasche; Umit Topaloglu

doi:10.1093/jamiaopen/ooy061

JAMIA Open ◽

10.1093/jamiaopen/ooy061 ◽

2019 ◽

Vol 2 (1) ◽

pp. 139-149 ◽

Cited By ~ 9

Author(s):

Meijian Guan ◽

Samuel Cho ◽

Robin Petro ◽

Wei Zhang ◽

Boris Pasche ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Cancer Patients ◽

Language Processing ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Free Text ◽

Treatment Change ◽

Progress Notes

Abstract. Objectives: Natural language processing (NLP) and machine learning approaches were used to build classifiers that identify genomic-related treatment changes in the free-text visit progress notes of cancer patients. Methods: We obtained 5889 deidentified progress reports (2439 words on average) for 755 cancer patients who had undergone clinical next-generation sequencing (NGS) testing at Wake Forest Baptist Comprehensive Cancer Center. An NLP system was implemented to process the free-text data and extract NGS-related information. Three types of recurrent neural network (RNN), namely gated recurrent unit, long short-term memory (LSTM), and bidirectional LSTM (LSTM_Bi), were applied to classify documents into treatment-change and no-treatment-change groups. Further, we compared the performance of the RNNs with that of 5 machine learning algorithms: Naive Bayes, K-nearest Neighbor, Support Vector Machine, Random Forest, and Logistic Regression. Results: Our results suggested that, overall, RNNs outperformed the traditional machine learning algorithms, and LSTM_Bi showed the best performance among the RNNs in terms of accuracy, precision, recall, and F1 score. In addition, pretrained word embeddings can improve the accuracy of LSTM by 3.4% and reduce the training time by more than 60%. Discussion and Conclusion: NLP and RNN-based text mining solutions have demonstrated advantages in information retrieval and document classification tasks for unstructured clinical progress notes.
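The four metrics named in the abstract (accuracy, precision, recall, and F1 score) can be computed from true and predicted labels for a two-class task such as treatment-change versus no-treatment-change. The following is a minimal sketch, not the authors' evaluation code; the function name `binary_metrics` and the toy labels are illustrative assumptions and are not data from the study.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # Predicted positive: true positive or false positive.
            counts["tp" if truth == positive else "fp"] += 1
        else:
            # Predicted negative: false negative or true negative.
            counts["fn" if truth == positive else "tn"] += 1
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = treatment change, 0 = no treatment change.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when the two classes are unevenly represented, because accuracy alone can look high for a classifier that rarely predicts the minority class.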

Computerized Answer Grading

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35044 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 618-619

Author(s):

Anurag Langan

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Computer Technology ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Grade Student ◽

Processing Techniques

Grading student answers is a tedious and time-consuming task. One study found that, on average, around 25% of a teacher's time is spent scoring students' answer sheets. This time could be put to much better use if computer technology were used to score answers. The proposed system aims to grade student answers using the Natural Language Processing techniques and Machine Learning algorithms available today.

An Analysis of Machine Learning Algorithms and Deep Neural Networks for Email Spam Classification using Natural Language Processing

10.1109/soli54607.2021.9672398 ◽

2021 ◽

Author(s):

Md. Mohidul Hasan ◽

Syed Mahbubuz Zaman ◽

Md. Asif Talukdar ◽

Ayesha Siddika ◽

Md. Golam Rabiul Alam

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Deep Neural Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Email Spam

Answer Script Evaluation using Machine Learning

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35070 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 849-852

Author(s):

Dr. K. Suresh

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Computational Methods ◽

Language Processing ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Text Extraction ◽

Processing Techniques

The current way of checking answer scripts is burdensome for colleges: staff must check the answers manually and allocate marks to the students. Our proposed system uses Machine Learning and Natural Language Processing techniques to address this. Machine learning algorithms use computational methods to learn directly from data without relying on predetermined rules. NLP algorithms identify specific entities within the text, search for key elements in a document, run a contextual search for synonyms, detect misspelled words or similar entries, and more. Our algorithm performs similarity checking and also counts the question-related words that match exactly between two documents. It also checks whether grammar is used correctly in the student's answer. Our proposed system performs text extraction and mark evaluation by applying Machine Learning and Natural Language Processing techniques.
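The two checks the abstract describes, similarity checking and counting exactly matched question-related words between two documents, might look like the sketch below. The abstract does not say which similarity measure is used, so bag-of-words cosine similarity is an assumption here, and the function names, example texts, and keyword set are all hypothetical. The grammar check is not covered.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(doc_a)), Counter(tokenize(doc_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def matched_keywords(reference, answer, keywords):
    """Keywords that appear verbatim in both the reference and the answer."""
    ref_tokens, ans_tokens = set(tokenize(reference)), set(tokenize(answer))
    return sorted(k for k in keywords if k in ref_tokens and k in ans_tokens)


# Hypothetical reference answer, student answer, and question keywords.
reference = "Photosynthesis converts light energy into chemical energy in chloroplasts"
answer = "Plants use photosynthesis to turn light into chemical energy"
keywords = {"photosynthesis", "light", "chemical", "chloroplasts"}

score = cosine_similarity(reference, answer)
matched = matched_keywords(reference, answer, keywords)
print(round(score, 3), matched)
```

A grading rule could then combine the similarity score with the fraction of keywords matched, although how the two are weighted is a design choice the abstract leaves open.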

Classifying lymphoma and tuberculosis case reports using machine learning algorithms

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v10i5.3132 ◽

2021 ◽

Vol 10 (5) ◽

pp. 2857-2865

Author(s):

Moanda Diana Pholo ◽

Yskandar Hamam ◽

Abdel Baset Khalaf ◽

Chunling Du

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Performance Metrics ◽

Case Reports ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Tuberculosis Case ◽

Starting Point

Available literature reports several lymphoma cases misdiagnosed as tuberculosis, especially in countries with a heavy TB burden. This frequent misdiagnosis is due to the fact that the two diseases can present with similar symptoms. The present study therefore aims to analyse and explore TB as well as lymphoma case reports using Natural Language Processing tools and evaluate the use of machine learning to differentiate between the two diseases. As a starting point in the study, case reports were collected for each disease using web scraping. Natural language processing tools and text clustering were then used to explore the created dataset. Finally, six machine learning algorithms were trained and tested on the collected data, which contained 765 lymphoma and 546 tuberculosis case reports. Each method was evaluated using various performance metrics. The results indicated that the multi-layer perceptron model achieved the best accuracy (93.1%), recall (91.9%) and precision score (93.7%), thus outperforming other algorithms in terms of correctly classifying the different case reports.
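To put the reported 93.1% accuracy in context, it helps to know how well a trivial classifier that always predicts the larger class would score. The sketch below computes that majority-class baseline from the two case-report counts in the abstract. It assumes the baseline is measured on the full collected dataset, since the abstract does not give the train/test split; the variable names are illustrative.

```python
# Case-report counts stated in the abstract.
lymphoma_reports = 765
tuberculosis_reports = 546

total_reports = lymphoma_reports + tuberculosis_reports

# A classifier that always predicts the majority class (lymphoma)
# is correct on every lymphoma report and wrong on every TB report.
majority_baseline = max(lymphoma_reports, tuberculosis_reports) / total_reports

print(f"total reports: {total_reports}")
print(f"majority-class baseline accuracy: {majority_baseline:.1%}")
```

Under that assumption the baseline is about 58.4%, so the multi-layer perceptron's accuracy reflects a substantial gain over always guessing lymphoma.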

Citation Classification Prediction Implying Text Features Using Natural Language Processing and Supervised Machine Learning Algorithms

Communications in Computer and Information Science - Recent Trends in Image Processing and Pattern Recognition ◽

10.1007/978-981-16-0507-9_46 ◽

2021 ◽

pp. 540-552

Author(s):

Priya Porwal ◽

Manoj H. Devare

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Text Features ◽

Classification Prediction

Grammatical categories determination for Turkish and Kazakh languages based on machine learning algorithms and fulfilling dictionaries of link grammar parser

Eastern-European Journal of Enterprise Technologies ◽

10.15587/1729-4061.2021.238743 ◽

2021 ◽

Vol 5 (2 (113)) ◽

pp. 55-65

Author(s):

Aigerim Yerimbetova ◽

Madina Tussupova ◽

Madina Sambetbayeva ◽

Mussa Turdalyuly ◽

Bakzhan Sakenov

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Parts Of Speech ◽

Grammatical Categories ◽

Learning Techniques

This research aims to identify parts of speech for the Kazakh and Turkish languages in an information retrieval system. The proposed algorithms are based on machine learning techniques: we treat the task as binary classification of words by part of speech and study the most popular, well-known machine learning algorithms. We defined 7 dictionaries and tagged 135 million words for Kazakh, and 9 dictionaries and 50 million words for Turkish. The main problem considered in the paper is to create algorithms that fill the dictionaries of the Link Grammar Parser (LGP) system, in particular for the Kazakh and Turkish languages, using machine learning techniques. The research focuses on reviewing and comparing machine learning algorithms and methods that have achieved results on natural language processing tasks such as determining grammatical categories. To operate, the LGP system needs a dictionary that specifies a connector for each word, that is, the type of connection that can be created using this word. The authors considered methods of filling in LGP dictionaries using machine learning. The complexities of natural language processing do not exclude the possibility of identifying narrower tasks that can already be solved algorithmically, for example determining parts of speech or splitting texts into logical groups. However, some features of natural languages significantly reduce the effectiveness of these solutions. In particular, taking into account all word forms for each word in the Kazakh and Turkish languages increases the complexity of text processing by an order of magnitude.

Detection of social media platform insults using Natural language processing and comparative study of machine learning algorithms

2020 24th International Conference on System Theory, Control and Computing (ICSTCC) ◽

10.1109/icstcc50638.2020.9259730 ◽

2020 ◽

Author(s):

Sruthi Chiramel ◽

Doina Logofatu ◽

Gheorghe Goldenthal

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Comparative Study ◽

Language Processing ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Social Media Platform ◽

Media Platform

Graph and Natural Language Processing Based Recommendation System for Choosing Machine Learning Algorithms

2020 12th International Conference on Advanced Infocomm Technology (ICAIT) ◽

10.1109/icait51223.2020.9315570 ◽

2020 ◽

Author(s):

Y. Mahima ◽

T.N.D.S. Ginige

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Recommendation System ◽

Learning Algorithms ◽

Machine Learning Algorithms

Integrating Natural Language Processing and Machine Learning Algorithms to Categorize Oncologic Response in Radiology Reports

Journal of Digital Imaging ◽

10.1007/s10278-017-0027-x ◽

2017 ◽

Vol 31 (2) ◽

pp. 178-184 ◽

Cited By ~ 30

Author(s):

Po-Hao Chen ◽

Hanna Zafar ◽

Maya Galperin-Aizenberg ◽

Tessa Cook

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Radiology Reports

Applying natural language processing and machine learning techniques to patient experience feedback: a systematic review

BMJ Health & Care Informatics ◽

10.1136/bmjhci-2020-100262 ◽

2021 ◽

Vol 28 (1) ◽

pp. e100262

Author(s):

Mustafa Khanbhai ◽

Patrick Anyadi ◽

Joshua Symons ◽

Kelsey Flott ◽

Ara Darzi ◽

...

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Patient Experience ◽

Language Processing ◽

Performance Metrics ◽

Free Text ◽

Patient Feedback

Objectives: Unstructured free-text patient feedback contains rich information, and analysing these data manually would require a lot of personnel resources which are not available in most healthcare organisations. To undertake a systematic review of the literature on the use of natural language processing (NLP) and machine learning (ML) to process and analyse free-text patient experience data. Methods: Databases were systematically searched to identify articles published between January 2000 and December 2019 examining NLP to analyse free-text patient feedback. Due to the heterogeneous nature of the studies, a narrative synthesis was deemed most appropriate. Data related to the study purpose, corpus, methodology, performance metrics and indicators of quality were recorded. Results: Nineteen articles were included. The majority (80%) of studies applied language analysis techniques on patient feedback from social media sites (unsolicited) followed by structured surveys (solicited). Supervised learning was frequently used (n=9), followed by unsupervised (n=6) and semisupervised (n=3). Comments extracted from social media were analysed using an unsupervised approach, and free-text comments held within structured surveys were analysed using a supervised approach. Reported performance metrics included the precision, recall and F-measure, with support vector machine and Naïve Bayes being the best performing ML classifiers. Conclusion: NLP and ML have emerged as an important tool for processing unstructured free text. Both supervised and unsupervised approaches have their role depending on the data source. With the advancement of data analysis tools, these techniques may be useful to healthcare organisations to generate insight from the volumes of unstructured free-text data.
