Efficient English text classification using selected Machine Learning Techniques

2021
Vol 60 (3)
pp. 3401-3409
Author(s):  
Xiaoyu Luo
Author(s):  
Padmavathi S.
M. Chidambaram

Text classification has become increasingly important for managing and organizing text data because of the tremendous growth of online information. It assigns documents to a fixed number of predefined categories. There are two approaches to text classification: rule-based and machine learning-based. In the rule-based approach, documents are classified according to manually defined rules. In the machine learning-based approach, the classification rules, or classifier, are learned automatically from example documents; this approach offers higher recall and faster processing. This paper presents an investigation of text classification using different machine learning techniques.
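The contrast between the two approaches can be illustrated with a minimal sketch. It is not taken from the paper: the keyword rules, the training sentences, and the choice of multinomial naive Bayes with add-one smoothing as the learned classifier are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-written rules (keyword -> category); not from the paper.
RULES = {"goal": "sports", "match": "sports",
         "election": "politics", "vote": "politics"}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def rule_based_classify(text, default="unknown"):
    """Rule-based approach: return the category of the first token that hits a manual rule."""
    for token in tokenize(text):
        if token in RULES:
            return RULES[token]
    return default


class NaiveBayes:
    """Machine learning approach: multinomial naive Bayes learned from example documents."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                # Add-one (Laplace) smoothed log likelihood.
                score += math.log((self.word_counts[label][token] + 1)
                                  / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up example documents used only to demonstrate training.
train_docs = [
    "the team scored a late goal",
    "a tense match ended in a draw",
    "the election results were announced",
    "citizens vote for a new parliament",
]
train_labels = ["sports", "sports", "politics", "politics"]
model = NaiveBayes().fit(train_docs, train_labels)
```

In this toy setup, a sentence such as "the striker scored" contains none of the rule keywords, so the rule-based function falls back to its default, while the learned model can still assign a category from word statistics. This is one way to picture the higher recall the abstract attributes to the machine learning approach.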


Author(s):  
Gaurav S. Chavan
Sagar Manjare
Parikshit Hegde
Amruta Sankhe

2020
Vol 1566
pp. 012066
Author(s):  
K Jamal
R Kurniawan
A S Batubara
M Z A Nazri
F Lestari
...  

2019
Vol 18 (03)
pp. 1950033
Author(s):  
Madan Lal Yadav
Basav Roychoudhury

One can use either machine learning techniques or lexicons to undertake sentiment analysis. Machine learning techniques include text classification algorithms like SVM, naive Bayes, decision tree or logistic regression, whereas lexicon-based sentiment analysis uses either general or domain-based lexicons. In this paper, we investigate the effectiveness of domain lexicons vis-à-vis a general lexicon, wherein we have performed aspect-level sentiment analysis on data from three different domains, viz. car, guitar and book. While it is intuitive that domain lexicons will always perform better than general lexicons, the actual performance may depend on the richness of the concerned domain lexicon as well as the text analysed. We used the general lexicon SentiWordNet and the corresponding domain lexicons in the aforesaid domains to compare their relative performances. The results indicate that a domain lexicon used along with a general lexicon performs better than either the general lexicon or the domain lexicon used alone. They also suggest that the performance of domain lexicons depends on the text content, and on whether the language involves technical or non-technical words in the concerned domain. This paper makes a case for development of domain lexicons across various domains for improved performance, while noting that they might not always perform better. It further highlights that the importance of general lexicons cannot be underestimated: the best results for aspect-level sentiment analysis are obtained, as per this paper, when both the domain and general lexicons are used side by side.
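To make the idea of using a domain lexicon alongside a general one concrete, here is a minimal sketch. Everything in it is assumed for illustration: the two toy lexicons and their polarity scores are made up (they are not SentiWordNet values), the domain-first lookup order is only one possible way of combining lexicons, and the scoring is done per sentence rather than per aspect as in the paper.

```python
# Toy lexicons with made-up polarity scores; not taken from SentiWordNet or the paper.
GENERAL_LEXICON = {"good": 1.0, "bad": -1.0, "loud": -0.5, "great": 1.0}
GUITAR_LEXICON = {"loud": 0.5, "warm": 1.0, "buzzing": -1.0}


def score(text, *lexicons):
    """Sum word polarities, taking each word's score from the first lexicon that lists it."""
    total = 0.0
    for token in text.lower().split():
        for lexicon in lexicons:
            if token in lexicon:
                total += lexicon[token]
                break  # earlier lexicons take priority over later ones
    return total


review = "good loud warm tone"
general_only = score(review, GENERAL_LEXICON)
domain_only = score(review, GUITAR_LEXICON)
combined = score(review, GUITAR_LEXICON, GENERAL_LEXICON)
```

Here the general lexicon treats "loud" as negative and misses "warm", the domain lexicon misses "good", and the combined lookup covers all three words. Whether such a combination helps on real data depends on the richness of the domain lexicon and on the text analysed, as the abstract notes.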


Author(s):  
Damian Alberto

The manual classification of a large amount of textual material is very costly in time and personnel. For this reason, a lot of research has been devoted to the problem of automatic classification, and work on the subject dates from 1960. A lot of text classification software has appeared. For some tasks, automatic classifiers perform almost as well as humans, but for others, the gap is still large. These systems are directly related to machine learning, which aims to perform tasks normally achievable only by humans. There are generally two types of learning: learning “by heart,” which consists of storing information as is, and learning by generalization, where we learn from examples. In this chapter, the authors address the classification concept in detail and how to solve different classification problems using different machine learning techniques.

