learning classifier

2022 ◽  
Vol 4 ◽  
Author(s):  
Sandipan Sikdar ◽  
Rachneet Sachdeva ◽  
Johannes Wachs ◽  
Florian Lemmerich ◽  
Markus Strohmaier

This work quantifies the effects of signaling gender through gender-specific user names on the success of reviews written on the popular Amazon.com shopping platform. Highly rated reviews play an important role in e-commerce since they are prominently displayed next to products. Differences in how reviews are perceived, consciously or unconsciously, with respect to gender signals can lead to crucial biases in determining what content and perspectives are represented among top reviews. To investigate this, we extract signals of author gender from user names to select reviews where the author's likely gender can be inferred. Using reviews authored by these gender-signaling authors, we train a deep learning classifier to quantify the gendered writing style (i.e., gendered performance) of reviews written by authors who do not send clear gender signals via their user name. We contrast the effects of gender signaling and performance on the review helpfulness ratings using matching experiments. The aim is to understand whether an advantage is to be gained by (not) signaling one's gender when posting reviews. While we find no general trend that gendered signals or performances influence overall review success, we find strong context-specific effects. For example, reviews in product categories such as Electronics or Computers are perceived as less helpful when authors signal that they are likely women, but are received as more helpful in categories such as Beauty or Clothing. In addition to these findings, we believe this general chain of tools could be deployed across various social media platforms.


2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Ziyuan Jiang ◽  
Jiajin Li ◽  
Nahyun Kong ◽  
Jeong-Hyun Kim ◽  
Bong-Soo Kim ◽  
...  

Abstract Atopic dermatitis (AD) is a common skin disease in childhood whose diagnosis requires expertise in dermatology. Recent studies have indicated that host gene–microbial interactions in the gut contribute to human diseases including AD. We sought to develop an accurate and automated pipeline for AD diagnosis based on transcriptome and microbiota data. Using these data from 161 subjects, including AD patients and healthy controls, we trained a machine learning classifier to predict the risk of AD. We found that the classifier could accurately differentiate subjects with AD from healthy individuals based on the omics data, with an average F1-score of 0.84. With this classifier, we also identified a set of 35 genes and 50 microbiota features that are predictive of AD. Among the selected features, we discovered at least three genes and three microorganisms directly or indirectly associated with AD. Although further replications in other cohorts are needed, our findings suggest that these genes and microbiota features may provide novel biological insights and may be developed into useful biomarkers for AD prediction.
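
The F1-score reported above is the harmonic mean of precision and recall. As a minimal sketch of how such a score is computed from confusion counts, the Python snippet below uses a hypothetical helper `f1_score` and made-up counts; neither is taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall).

    tp, fp, fn are true-positive, false-positive and false-negative counts.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only, not data from the paper.
example_f1 = f1_score(tp=8, fp=2, fn=2)
print(round(example_f1, 2))
```

When a study reports an average F1-score, the per-fold or per-class values would each be computed this way and then averaged; the abstract does not say which averaging scheme was used.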


2022 ◽  
pp. 119-147
Author(s):  
Qingzhong Liu ◽  
Tze-Li Hsu

The detection of different types of forgery manipulation, including seam carving in JPEG images, is a hot spot in image forensics. Seam carving was originally designed for content-aware image resizing, but it is also used for forgery manipulation. It remains very challenging to identify seam-carving forgery effectively under recompression. To address these highly challenging detection problems, this chapter introduces an effective approach based on large feature mining. Ensemble learning is used to deal with the high dimensionality and to avoid the overfitting that may occur with some traditional learning classifiers. The experimental results validate the efficacy of the proposed approach in detecting JPEG double compression and exposing seam-carving forgery when the JPEG recompression is performed at the same quality or a lower quality, which is generally much harder for traditional detection methods. The methodology introduced in this chapter provides a strategy and a realistic approach to resolving these highly challenging problems in image forensics.


2022 ◽  
Vol 924 (1) ◽  
pp. L10
Author(s):  
Rahul Jayaraman ◽  
Swetlana Hubrig ◽  
Daniel L. Holdsworth ◽  
Markus Schöller ◽  
Silva Järvinen ◽  
...  

Abstract We report the detection and characterization of a new magnetospheric star, HD 135348, based on photometric and spectropolarimetric observations. The TESS light curve of this star exhibited variations consistent with stars known to possess rigidly rotating magnetospheres (RRMs), so we obtained spectropolarimetric observations using the Robert Stobie Spectrograph (RSS) on the Southern African Large Telescope (SALT) at four different rotational phases. From these observations, we calculated the longitudinal magnetic field of the star, ⟨B_z⟩, as well as the Alfvén and Kepler radii, and deduced that this star contains a centrifugal magnetosphere. However, an archival spectrum does not exhibit the characteristic "double-horned" emission profile for Hα and the Brackett series that has been observed in many other RRM stars. This could be due to the insufficient rotational phase coverage of the available set of observations, as the spectra of these stars vary significantly with the star's rotation. Our analysis underscores the use of TESS in photometrically identifying magnetic star candidates for spectropolarimetric follow-up using ground-based instruments. We are evaluating the implementation of a machine-learning classifier to search for more examples of RRM stars in TESS data.


2022 ◽  
pp. 201-209
Author(s):  
Umesh Anandrao Patil ◽  
Sanjeev J. Wagh

The medical industry has advanced to the point where high-end technologies are used for the early detection and analysis of diseases that are hard to identify with standard medical procedures. One such disease is diabetic retinopathy (DR), which is further classified into non-proliferative diabetic retinopathy (NPDR) and proliferative diabetic retinopathy (PDR). Early detection of NPDR is a challenging task that requires examining fundus images under magnification. To address the difficulty of early DR detection, the authors propose an automated system that combines machine learning classifier techniques with a convolutional neural network (CNN) to self-train the system and detect early-stage DR in retinal scans, using feature extraction and existing retinal scan databases. The system is intended to reduce the human error of failing to detect early DR during diagnosis and to help treat patients in the early stages.


Author(s):  
Atichart Sinsongsuk ◽  
Thapana Boonchoo ◽  
Wanida Putthividhya

Map matching deals with matching GPS coordinates to corresponding points or segments on a road network map. It has various applications in both vehicle navigation and tracking. Traditional rule-based approaches to the map matching problem yield good matching results. However, their performance depends on the underlying algorithm and the mathematical or statistical models employed. For example, HMM map matching has O(N^2) time complexity, where N is the number of states in the underlying Hidden Markov Model. Map matching techniques with high-order time complexity are impractical for providing services, especially in time-sensitive applications, because of their slow responsiveness and the large amount of computing power required to obtain results. This paper proposes a novel data-driven approach for projecting GPS trajectories onto a road network. We constructed a supervised-learning classifier using the Multi-Label Classification (MLC) technique and HMM map matching results. Analytically, our approach has O(N) time complexity, suggesting that it runs faster in map matching applications in which response time is the major concern. In addition, our experimental results indicated that we could achieve a Jaccard similarity index of 0.30 and an overlap coefficient of 0.70.
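
The two evaluation metrics named above compare a predicted set of road segments with a reference set. The following Python sketch shows the standard definitions; the function names, the segment identifiers, and the convention of returning 0.0 for empty inputs are assumptions made for illustration, not details from the paper.

```python
def jaccard_similarity(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|; returns 0.0 when both sets are empty (assumed convention)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_coefficient(a: set, b: set) -> float:
    """|A ∩ B| / min(|A|, |B|); returns 0.0 when either set is empty (assumed convention)."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


# Hypothetical road-segment IDs, for illustration only.
predicted = {"s1", "s2", "s3", "s4"}
reference = {"s3", "s4", "s5"}

print(jaccard_similarity(predicted, reference))   # 2 shared / 5 total
print(overlap_coefficient(predicted, reference))  # 2 shared / 3 in the smaller set
```

Because the overlap coefficient divides by the smaller set, it is never lower than the Jaccard index for the same pair of sets, which is consistent with the gap between the two reported values.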


2021 ◽  
Author(s):  
Sheela J ◽  
Janet B

Abstract This paper proposes a multi-document summarization model using an optimization algorithm named CAVIAR Sun Flower Optimization (CAV-SFO). In this method, two classifiers, a Generative Adversarial Network (GAN) classifier and a Deep Recurrent Neural Network (Deep RNN), are used to generate a score for summarizing multiple documents. Initially, the simHash method is applied to remove the duplicate/real duplicate contents from sentences. Then, the result is given to the proposed CAV-SFO based GAN classifier to determine the score for individual sentences. The CAV-SFO is newly designed by incorporating CAVIAR with the Sun Flower Optimization Algorithm (SFO). In parallel, the duplicate-removed sentences from the input documents are pre-processed using stop word removal and stemming. Afterward, text-based features are extracted from the pre-processed documents, and the CAV-SFO based Deep RNN is introduced to generate a score, with the internal model parameters optimally tuned. Finally, the scores generated by the CAV-SFO based GAN and the CAV-SFO based Deep RNN are hybridized, and the final score is obtained using a multi-document compression ratio. The proposed TaylorALO-based GAN showed improved results with maximal precision of 0.989, maximal recall of 0.986, maximal F-Measure of 0.823, maximal Rouge-Precision of 0.930, and maximal Rouge-recall of 0.870.
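
The duplicate-removal step mentioned above relies on SimHash fingerprints, where similar texts tend to produce fingerprints that differ in only a few bits. The Python sketch below is a generic, simplified SimHash with whitespace tokenization and unweighted tokens; the function names and the `max_distance=3` threshold are assumptions for illustration and are not the authors' configuration.

```python
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """Compute a simple SimHash fingerprint over lowercase whitespace tokens."""
    weights = [0] * bits
    for token in text.lower().split():
        # Use the first 8 bytes of MD5 as a stable 64-bit token hash.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set where the votes for that position are positive.
    return sum(1 << i for i, w in enumerate(weights) if w > 0)


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def drop_near_duplicates(sentences, max_distance=3):
    """Keep a sentence only if its fingerprint is far from all kept ones."""
    kept, fingerprints = [], []
    for sentence in sentences:
        fp = simhash(sentence)
        if all(hamming_distance(fp, other) > max_distance for other in fingerprints):
            kept.append(sentence)
            fingerprints.append(fp)
    return kept


# Exact duplicates (ignoring case) always share a fingerprint and are dropped.
unique = drop_near_duplicates(["The cat sat", "the cat sat"])
print(unique)
```

Production implementations usually weight tokens (for example by term frequency) and index fingerprints for fast lookup; this sketch compares every pair and is only meant to show the idea.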


PLoS ONE ◽  
2021 ◽  
Vol 16 (12) ◽  
pp. e0260528
Author(s):  
Robyn A. Barbato ◽  
Robert M. Jones ◽  
Michael A. Musty ◽  
Scott M. Slone

Electrogenic bacteria produce power in soil-based terrestrial microbial fuel cells (tMFCs) by growing on electrodes and transferring electrons released from the breakdown of substrates. The direction and magnitude of voltage production are hypothesized to depend on the available substrates. A sensor technology was developed for compounds indicative of anthropological activity by exposing tMFCs to gasoline, petroleum, 2,4-dinitrotoluene, fertilizer, and urea. A machine learning classifier was trained to identify compounds based on the voltage patterns. After 5 to 10 days, the mean voltage stabilized (±0.5 mV). After the entire incubation, voltage ranged from -59.1 mV to 631.8 mV, with the tMFCs containing urea and gasoline producing the highest (624 mV) and lowest (-9 mV) average voltage, respectively. The machine learning algorithm effectively discriminated between gasoline, urea, and fertilizer with greater than 94% accuracy, demonstrating that this technology could be operated as an environmental sensor for change detection.


Mathematics ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 29
Author(s):  
Jersson X. Leon-Medina ◽  
Núria Parés ◽  
Maribel Anaya ◽  
Diego A. Tibaduiza ◽  
Francesc Pozo

The classification and use of robust methodologies in sensor array applications of electronic noses (ENs) remain an open problem. Among the several steps used in the developed methodologies, data preprocessing improves the classification accuracy of this type of sensor. Data preprocessing methods, such as data transformation and data reduction, enable the treatment of data with anomalies, such as outliers and features that do not provide quality information; in addition, they reduce the dimensionality of the data, thereby facilitating the tasks of a machine learning classifier. To help solve this problem, in this study, a machine learning methodology is introduced to improve signal processing and develop methodologies for classification when an EN is used. The proposed methodology involves a normalization stage to scale the data from the sensors, using both the well-known min-max approach and the more recent mean-centered unitary group scaling (MCUGS). Next, a manifold learning algorithm for data reduction is applied using uniform manifold approximation and projection (UMAP). The dimensionality of the data at the input of the classification machine is reduced, and an extreme learning machine (ELM) is used as a machine learning classifier algorithm. To validate the EN classification methodology, three datasets of ENs were used. The first dataset was composed of 3600 measurements of 6 volatile organic compounds performed by employing 16 metal-oxide gas sensors. The second dataset was composed of 235 measurements of 3 different qualities of wine, namely, high, average, and low, as evaluated by using an EN sensor array composed of 6 different sensors. The third dataset was composed of 309 measurements of 3 different gases obtained by using an EN sensor array of 2 sensors. A 5-fold cross-validation approach was used to evaluate the proposed methodology. A test set consisting of 25% of the data was used to validate the methodology with unseen data. The results showed a fully correct average classification accuracy of 1 when the MCUGS, UMAP, and ELM methods were used. Finally, the effect of changing the number of target dimensions on the reduction of the number of data was determined based on the highest average classification accuracy.
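
Of the two normalization schemes named above, min-max scaling is the simpler: each sensor channel is rescaled to the range [0, 1] using its own minimum and maximum. The Python sketch below illustrates that step only; the function name, the handling of constant columns, and the sample readings are assumptions for illustration, and MCUGS, UMAP, and the ELM classifier are not shown.

```python
def min_max_scale(rows):
    """Scale each column of a list of equal-length rows to the range [0, 1].

    Constant columns are mapped to 0.0 to avoid division by zero
    (an assumed convention).
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Hypothetical readings from two sensors (one column per sensor).
readings = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 40.0],
]
scaled = min_max_scale(readings)
print(scaled)
```

In a cross-validated setting such as the one described, the minima and maxima would normally be computed on the training folds only and then applied to the held-out data, so that the test set stays unseen.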


2021 ◽  
Author(s):  
Lijun Yao ◽  
Zhiwei Xu ◽  
Xudong Zhao ◽  
Yang Chen ◽  
Liang Liu ◽  
...  

Abstract Background: Side effects in psychotherapy are sometimes unavoidable. Therapists play a significant role in the side effects of psychotherapy, but there have been few quantitative studies on the mechanisms by which therapists contribute to them. Methods: We designed the Psychotherapy Side Effects Questionnaire-Therapist Version (PSEQ-T) and released it online through an official WeChat account, where 530 therapists participated in the cross-sectional analysis. The therapists were classified into groups with and without perceptions of clients' side effects. A number of features were selected to distinguish the therapists by category. Six machine learning-based algorithms were selected and trained on our dataset to build classification models. To make the prediction model interpretable, we leveraged the Shapley Additive exPlanations (SHAP) method to quantify the importance of each feature to the therapist categories. Results: Our study demonstrated the following: 1) Of the therapists, 316 perceived the side effects of the clients in the ongoing psychotherapy sessions, with a 59.6% incidence of side effects. Among all 7 perception types of the side effects, the most common type was "make the clients or patients feel bad" (49.8%). 2) A random forest-based machine-learning classifier offered the best predictive performance to distinguish the therapists with and without perceptions of clients' side effects, with an F1 score of 0.722 and an AUC value of 0.717. 3) When "therapists' psychological activity" was considered a possible cause of the side effects in psychotherapy by the therapists, it was the most relevant feature for distinguishing the therapist category. Conclusions: Our study revealed that the therapist's mastery of the limitations of psychotherapy technology and theory, especially the awareness and construction of their own psychological states, was the most important factor in predicting the therapist's perception of the side effects of psychotherapy.

