Inference of an Integrative, Executable Network for Rheumatoid Arthritis Combining Data-Driven Machine Learning Approaches and a State-of-the-Art Mechanistic Disease Map

2021 ◽  
Vol 11 (8) ◽  
pp. 785
Author(s):  
Quentin Miagoux ◽  
Vidisha Singh ◽  
Dereck de Mézquita ◽  
Valerie Chaudru ◽  
Mohamed Elati ◽  
...  

Rheumatoid arthritis (RA) is a multifactorial, complex autoimmune disease that involves various genetic, environmental, and epigenetic factors. Systems biology approaches provide the means to study complex diseases by integrating different layers of biological information. Combining multiple data types can help compensate for missing or conflicting information and limit the possibility of false positives. In this work, we aim to unravel mechanisms governing the regulation of key transcription factors in RA and derive patient-specific models to gain more insights into the disease heterogeneity and the response to treatment. We first use publicly available transcriptomic datasets (peripheral blood) relative to RA and machine learning to create an RA-specific transcription factor (TF) co-regulatory network. The TF cooperativity network is subsequently enriched in signalling cascades and upstream regulators using a state-of-the-art, RA-specific molecular map. Then, the integrative network is used as a template to analyse patients’ data regarding their response to anti-TNF treatment and identify master regulators and upstream cascades affected by the treatment. Finally, we use the Boolean formalism to simulate in silico subparts of the integrated network and identify combinations and conditions that can switch on or off the identified TFs, mimicking the effects of single and combined perturbations.
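The last step of the abstract above, simulating a network under the Boolean formalism and looking for input combinations that switch a TF on or off, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the node names (`TNF`, `INHIBITOR`, `KINASE`, `TF`), the logical rules, and the synchronous update scheme are hypothetical and are not taken from the RA map or from the authors' models.

```python
from itertools import product

# Hypothetical toy network: node names and rules are illustrative only.
RULES = {
    "TNF": lambda s: s["TNF"],                # input node, holds its value
    "INHIBITOR": lambda s: s["INHIBITOR"],    # input node, holds its value
    "KINASE": lambda s: s["TNF"] and not s["INHIBITOR"],
    "TF": lambda s: s["KINASE"],
}


def step(state):
    """Synchronous update: every node is recomputed from the previous state."""
    return {node: bool(rule(state)) for node, rule in RULES.items()}


def attractor(state, max_steps=50):
    """Iterate until a state repeats; return the repeating cycle of states."""
    seen = []
    for _ in range(max_steps):
        if state in seen:
            return seen[seen.index(state):]
        seen.append(state)
        state = step(state)
    raise RuntimeError("no attractor found within max_steps")


def tf_outcomes():
    """Map each combination of the two inputs to the TF value at the attractor."""
    results = {}
    for tnf, inhibitor in product([False, True], repeat=2):
        start = {"TNF": tnf, "INHIBITOR": inhibitor, "KINASE": False, "TF": False}
        cycle = attractor(start)
        results[(tnf, inhibitor)] = cycle[-1]["TF"]
    return results


if __name__ == "__main__":
    for (tnf, inhibitor), tf_on in tf_outcomes().items():
        print(f"TNF={tnf!s:5} INHIBITOR={inhibitor!s:5} -> TF={tf_on}")
```

In this toy network the TF ends up active only when the stimulus is present and the inhibitor is absent. Real models of this kind are typically larger and may use asynchronous updates, in which case attractors can be cycles rather than single fixed points.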

2021 ◽  
Author(s):  
Quentin Miagoux ◽  
Dereck de Mezquita ◽  
Vidisha Singh ◽  
Valerie Chaudru ◽  
Mohamed Elati ◽  
...  

Motivation: Rheumatoid arthritis (RA) is a multifactorial autoimmune disease that causes chronic inflammation of the joints. RA is considered a complex disease as it involves various genetic, environmental, and epigenetic factors. Systems biology approaches provide the means to study complex diseases by integrating different layers of biological information. Combining multiple data types can help compensate for missing or conflicting information and limit the possibility of false positives. In this approach, we integrate three different biological layers (gene expression, signalling cascades, mutations), obtained by bottom-up and top-down methods, to build an integrative, disease-specific network. The goal of this endeavour is to see whether we can unravel mechanisms governing the regulation of key genes identified as mutation carriers in RA and derive patient-specific models to gain more insight into disease heterogeneity. Results: In this work, we combine biological data relevant to rheumatoid arthritis in the form of a global, integrative network. We first use publicly available transcriptomic datasets (peripheral blood) related to RA, together with machine learning, to create an RA-specific transcription factor (TF) co-regulatory network. The TF cooperativity network is subsequently enriched in signalling cascades and upstream regulators using prior knowledge encoded in a state-of-the-art, RA-specific molecular map. Lastly, a list of RA-specific variants highlights key genes associated with known disease mutations. Availability: Datasets used for the analysis are publicly available. All scripts used to generate results and the Shiny app will be freely accessible after peer-reviewed publication.


2020 ◽  
pp. 1-21 ◽  
Author(s):  
Clément Dalloux ◽  
Vincent Claveau ◽  
Natalia Grabar ◽  
Lucas Emanuel Silva Oliveira ◽  
Claudia Maria Cabral Moro ◽  
...  

Automatic detection of negated content is often a prerequisite in information extraction systems in various domains. In the biomedical domain especially, negation plays an important role. In this work, two main contributions are proposed. First, we work with languages that have been poorly addressed up to now: Brazilian Portuguese and French. We therefore developed new corpora for these two languages, manually annotated with negation cues and their scope. Second, we propose methods based on supervised machine learning for the automatic detection of negation cues and their scopes. The methods prove robust in both languages (Brazilian Portuguese and French) and in cross-domain (general and biomedical language) contexts. The approach is also validated on English data from the state of the art: it yields very good results and outperforms other existing approaches. In addition, the application is accessible and usable online. We expect that these contributions (new annotated corpora, an application accessible online, and cross-domain robustness) will improve the reproducibility of the results and the robustness of NLP applications.


2020 ◽  
Vol 14 (1) ◽  
pp. 151-178
Author(s):  
Luca Oneto

Machine learning based systems and products are reaching society at large in many aspects of everyday life, including financial lending, online advertising, pretrial and immigration detention, child maltreatment screening, health care, social services, and education. This phenomenon has been accompanied by an increase in concern about the ethical issues that may arise from the adoption of these technologies. In response to this concern, a new area of machine learning has recently emerged that studies how to address disparate treatment caused by algorithmic errors and bias in the data. The central question is how to ensure that the learned model does not treat subgroups in the population unfairly. While the design of solutions to this issue requires an interdisciplinary effort, fundamental progress can only be achieved through a radical change in the machine learning paradigm. In this work, we describe the state of the art on algorithmic fairness using statistical learning theory, machine learning, and deep learning approaches that are able to learn fair models and data representations.


2018 ◽  
Vol 6 ◽  
pp. 343-356 ◽  
Author(s):  
Egoitz Laparra ◽  
Dongfang Xu ◽  
Steven Bethard

This paper presents the first model for time normalization trained on the SCATE corpus. In the SCATE schema, time expressions are annotated as a semantic composition of time entities. This novel schema favors machine learning approaches, as it can be viewed as a semantic parsing task. In this work, we propose a character-level multi-output neural network that outperforms the previous state of the art built on the TimeML schema. To compare predictions of systems that follow either SCATE or TimeML, we present a new scoring metric for time intervals. We also apply this new metric to carry out a comparative analysis of the annotations of both schemas in the same corpus.


Mathematics ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. 2075
Author(s):  
Óscar Apolinario-Arzube ◽  
José Antonio García-Díaz ◽  
José Medina-Moreira ◽  
Harry Luna-Aveiga ◽  
Rafael Valencia-García

Automatic satire identification can help to identify texts in which the intended meaning differs from the literal meaning, improving tasks such as sentiment analysis, fake news detection or natural-language user interfaces. Typically, satire identification is performed by training a supervised classifier to find linguistic clues that determine whether a text is satirical or not. For this, the state of the art relies on neural networks fed with word embeddings that are capable of learning interesting characteristics of the way humans communicate. However, to our knowledge, there are no comprehensive studies that evaluate these techniques for satire identification in Spanish. Consequently, in this work we evaluate several deep-learning architectures with Spanish pre-trained word embeddings and compare the results with strong baselines based on term-counting features. This evaluation is performed with two datasets that contain satirical and non-satirical tweets written in two Spanish variants: European Spanish and Mexican Spanish. Our experiments revealed that term-counting features achieved results similar to those of deep-learning approaches based on word embeddings, both outperforming previous results based on linguistic features. Our results suggest that term-counting features and traditional machine learning models provide competitive results for automatic satire identification, slightly outperforming state-of-the-art models.


Author(s):  
Minsik Oh ◽  
Sungjoon Park ◽  
Sun Kim ◽  
Heejoon Chae

Gene expressions are subtly regulated by quantifiable measures of genetic molecules such as interaction with other genes, methylation, mutations, transcription factors and histone modifications. Integrative analysis of multi-omics data can help scientists understand condition-specific or patient-specific gene regulation mechanisms. However, analysis of multi-omics data is challenging since it requires not only the analysis of multiple omics data sets but also mining complex relations among different genetic molecules by using state-of-the-art machine learning methods. In addition, analysis of multi-omics data requires substantial computing infrastructure. Moreover, interpretation of the analysis results requires collaboration among many scientists, often requiring the analysis to be repeated from different perspectives. Many of these technical issues can be handled well when machine learning tools are deployed on the cloud. In this survey article, we first survey machine learning methods that can be used for gene regulation study, and we categorize them according to five different goals: gene regulatory subnetwork discovery, disease subtype analysis, survival analysis, clinical prediction and visualization. We also summarize the methods in terms of multi-omics input types. Then, we explain why the cloud is potentially a good solution for the analysis of multi-omics data, followed by a survey of two state-of-the-art cloud systems, Galaxy and BioVLAB. Finally, we discuss important issues that arise when the cloud is used for the analysis of multi-omics data in gene regulation studies.


2021 ◽  
Vol 13 (22) ◽  
pp. 12613
Author(s):  
Najihah Ahmad Latif ◽  
Fatini Nadhirah Mohd Nain ◽  
Nurul Hashimah Ahamed Hassain Malim ◽  
Rosni Abdullah ◽  
Muhammad Farid Abdul Rahim ◽  
...  

Oil palm is one of the main crops grown to help achieve sustainability in Malaysia. Selecting the best breeds produces quality crops and increases crop yields. This study aimed to examine machine learning (ML) in oil palm breeding (OPB) using factors other than genetic data. A new conceptual framework for adopting ML in OPB is presented at the end of this paper. First, data types, phenotype traits, current ML models, and evaluation techniques are identified through a literature survey. This study found that phenotype and genotype data are widely used in oil palm breeding programs. The average bunch weight, bunch number, and fresh fruit bunch are the most important characteristics that can influence the genetic improvement of progenies. Although machine learning approaches have been applied to increase the productivity of the crop, most studies focus on molecular markers or genotypes for plant breeding, rather than on phenotype. Theoretically, ML applied to phenotypic data related to offspring should be able to predict high breeding values. Therefore, a new ML conceptual framework to study the phenotype and progeny data of oil palm breeds is discussed in relation to achieving the Sustainable Development Goals (SDGs).


2016 ◽  
Author(s):  
Michael P. Pound ◽  
Alexandra J. Burgess ◽  
Michael H. Wilson ◽  
Jonathan A. Atkinson ◽  
Marcus Griffiths ◽  
...  

Deep learning is an emerging field that promises unparalleled results on many data analysis problems. We show the success offered by such techniques when applied to the challenging problem of image-based plant phenotyping, and demonstrate state-of-the-art results for root and shoot feature identification and localisation. We predict a paradigm shift in image-based phenotyping thanks to deep learning approaches.


Author(s):  
Sanam Narejo ◽  
Eros Pasero ◽  
Farzana Kulsoom

A Brain-Computer Interface (BCI) provides an alternative communication interface between the human brain and a computer. Electroencephalogram (EEG) signals are acquired and processed, and machine learning algorithms are then applied to extract useful information. During EEG acquisition, artifacts are induced by involuntary eye movements or eye blinks, adversely affecting system performance. The aim of this research is to predict eye states from EEG signals using deep learning architectures and to present improved classifier models. Recent studies indicate that Deep Neural Networks are among the state-of-the-art machine learning approaches. Therefore, the current work presents the implementation of a Deep Belief Network (DBN) and Stacked AutoEncoders (SAE) as classifiers with encouraging accuracy. One of the designed SAE models outperforms the DBN and the models presented in existing research, with an error rate of 1.1% on the test set, that is, an accuracy of 98.9%. The findings of this study may contribute towards state-of-the-art performance on the problem of EEG-based eye state classification.


Rheumatology ◽  
2020 ◽  
Vol 59 (Supplement_2) ◽  
Author(s):  
Mateusz Maciejewski ◽  
Caroline Sands ◽  
Nisha Nair ◽  
Stephanie Ling ◽  
Suzanne Verstappen ◽  
...  

Background: For patients with rheumatoid arthritis (RA), introduction of early, effective therapy has consistently been shown to improve long-term outcomes. Low-dose methotrexate (MTX) is commonly prescribed as first-line treatment for RA. However, MTX is not effective for a large minority of patients and there is currently no way to determine ahead of therapy which patients are most likely to benefit. Metabolomics and lipidomics are emerging approaches for studying patient stratification in RA and have the potential to identify disease processes that underpin treatment outcomes. Here we apply state-of-the-art machine learning algorithms to predict MTX treatment response, by testing serum lipid levels measured at two time-points (pre-treatment and following 4 weeks on drug) to predict MTX response by 6 months. Methods: This study included patients from the Rheumatoid Arthritis Medication Study (RAMS), a UK multi-centre one-year prospective observational study investigating predictors of response to MTX in patients with RA. Since 2008, patients who are about to start MTX for the first time are asked to provide demographic and clinical data, as well as blood samples to permit DNA, RNA and serum-based biomarker studies. Patients about to commence MTX treatment were followed longitudinally and those categorised as good or non-responders following 6 months on-drug using EULAR response criteria were analysed. Serum lipid levels were measured at pre-treatment and following 4 weeks on drug using ultra-performance liquid chromatography tailored for complex lipid analysis, coupled to mass spectrometry. State-of-the-art supervised machine learning methods were then applied to predict EULAR response at 6 months. Models including lipid levels were compared to models including clinical covariates (including: MTX start dose, steroid use at inclusion, BMI, number of swollen joints, number of tender joints, CRP levels, patients’ assessment of their overall wellbeing, gender, age-at-inclusion, age-at-onset, disease duration, HAQ score and pre-treatment smoking habits). Results: Following quality control, 3,366 features (1,060 in negatively-charged mode and 2,306 in positive mode) were available for analysis at pre-treatment and 4 weeks from 100 RA patients categorised as good (GR, n = 50) or poor (NR, n = 50) responders to MTX following 6 months on drug. The best model performance for the classifier including clinical covariates was observed using L1/L2-regularised logistic regression (ROC AUC 0.68 ± 0.02). However, the clinical covariate model outperformed the classifier including lipid levels when either pre- or on-treatment time-points were investigated (ROC AUC 0.61 ± 0.02). Conclusion: These data do not support the utility of early treatment lipidomic monitoring in routine clinical practice in patients started on MTX for their RA. Disclosures: M. Maciejewski: Shareholder/stock ownership; owns stock or stock options in Pfizer. C. Sands: None. N. Nair: None. S. Ling: None. S. Verstappen: None. K. Hyrich: None. A. Barton: None. D. Ziemek: Shareholder/stock ownership; owns stock or stock options in Pfizer. M. Lewis: None. D. Plant: None.
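The ROC AUC figures quoted in the abstract above summarise how well a classifier's scores rank responders above non-responders. As a minimal sketch of what that number means, the function below computes AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as half. The label coding (1 = good responder, 0 = non-responder) and the example scores are hypothetical and are not data from the study; the study's own pipeline is not described at this level of detail.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    Equals the probability that a randomly chosen positive case is scored
    higher than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels (1 = good responder) and predicted scores.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.6, 0.1]
    print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which values such as 0.68 and 0.61 are read.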

