Multi-hop assortativities for network classification

Leonardo Gutiérrez-Gómez; Jean-Charles Delvenne

doi:10.1093/comnet/cny034


Journal of Complex Networks ◽

10.1093/comnet/cny034 ◽

2018 ◽

Vol 7 (4) ◽

pp. 603-622 ◽

Cited By ~ 1

Author(s):

Leonardo Gutiérrez-Gómez ◽

Jean-Charles Delvenne

Keyword(s):

Machine Learning ◽

Scientific Collaboration ◽

State Of The Art ◽

Medical Engineering ◽

Research Field ◽

Classification Task ◽

Collaboration Network ◽

Structural Patterns ◽

Art Methods

Abstract Several social, medical, engineering and biological challenges rely on discovering the functionality of networks from their structure and node metadata, when it is available. For example, in chemoinformatics one might want to detect whether a molecule is toxic based on structure and atomic types, or discover the research field of a scientific collaboration network. Existing techniques rely on counting or measuring structural patterns that are known to show large variations from network to network, such as the number of triangles, or the assortativity of node metadata. We introduce the concept of multi-hop assortativity, which captures the similarity of the nodes situated at the extremities of a randomly selected path of a given length. We show that multi-hop assortativity unifies various existing concepts and offers a versatile family of ‘fingerprints’ to characterize networks. These fingerprints in turn make it possible to recover the functionalities of a network with the help of the machine learning toolbox. Our method is evaluated empirically on established social and chemoinformatic network benchmarks. Results reveal that our assortativity-based features are competitive, providing highly accurate results and often outperforming state-of-the-art methods for the network classification task.
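The notion of comparing the two end nodes of a path of fixed length can be illustrated with a small sketch. This is not the authors' implementation: it assumes the "randomly selected path" is a random walk on an undirected graph started from its stationary distribution, that the node metadata is a single number per node, and that no node is isolated. The function name `multi_hop_assortativity` and its arguments are hypothetical. Under these assumptions the one-hop value coincides with the usual edge-based assortativity coefficient of a scalar attribute.

```python
def multi_hop_assortativity(adj, x, hops):
    """Correlation of attribute x at the two ends of a `hops`-step random walk.

    adj  : dict mapping each node to the list of its neighbours (undirected,
           every node has at least one neighbour) -- assumed input format
    x    : dict mapping each node to a numeric attribute
    hops : number of steps of the walk (0, 1, 2, ...)
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    total_degree = sum(degree.values())
    # Stationary distribution of the walk: proportional to node degree.
    pi = {v: degree[v] / total_degree for v in adj}

    # y[v] = expected attribute seen after `hops` steps when starting at v.
    y = dict(x)
    for _ in range(hops):
        y = {v: sum(y[u] for u in adj[v]) / degree[v] for v in adj}

    mean = sum(pi[v] * x[v] for v in adj)
    variance = sum(pi[v] * x[v] ** 2 for v in adj) - mean ** 2
    covariance = sum(pi[v] * x[v] * y[v] for v in adj) - mean ** 2
    return covariance / variance


# A 4-cycle whose nodes alternate between attribute +1 and -1.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
attr = {0: 1.0, 1: -1.0, 2: 1.0, 3: -1.0}
fingerprint = [multi_hop_assortativity(cycle, attr, t) for t in (1, 2, 3)]
```

Evaluating the function for several hop counts gives a vector that could serve as one of the 'fingerprints' fed to a classifier; on the alternating 4-cycle the sign flips with every hop because each step moves to a node of the opposite attribute.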


Power Quality: Scientific Collaboration Networks and Research Trends

Energies ◽

10.3390/en11082067 ◽

2018 ◽

Vol 11 (8) ◽

pp. 2067 ◽

Cited By ~ 13

Author(s):

Francisco Montoya ◽

Raul Baños ◽

Alfredo Alcayde ◽

Maria Montoya ◽

Francisco Manzano-Agugliaro

Keyword(s):

Power Quality ◽

Scientific Collaboration ◽

Research Field ◽

Research Trends ◽

Collaboration Network ◽

Collaboration Networks ◽

Different Types ◽

Voltage Frequency ◽

Advanced Model ◽

Proper Operation

Power quality is a research field related to the proper operation of devices and technological equipment in industry, service, and domestic activities. The level of power quality is determined by variations in voltage, frequency, and waveforms with respect to reference values. These variations correspond to different types of disturbances, including power fluctuations, interruptions, and transients. Several studies have focused on analysing power quality issues. However, there is a lack of studies analysing both the trending topics and the scientific collaboration network underlying the field of power quality. To address these aspects, an advanced model is used to retrieve data from publications related to power quality and to analyse this information using graph visualisation software and statistical tools. The results suggest that research interests are mainly focused on the analysis of power quality problems and mitigation techniques. Furthermore, important collaboration networks are observed between researchers within and across countries.


Comparative Quality Estimation for Machine Translation: Observations on Machine Learning and Features

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2017-0029 ◽

2017 ◽

Vol 108 (1) ◽

pp. 307-318 ◽

Cited By ~ 1

Author(s):

Eleftherios Avramidis

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Machine Translation ◽

State Of The Art ◽

Linear Method ◽

The State ◽

Quality Estimation ◽

Art Methods ◽

Improved Performance

Abstract A deeper analysis of Comparative Quality Estimation is presented by extending the state-of-the-art methods with adequacy and grammatical features from other Quality Estimation tasks. The previously used linear method, unable to cope with the augmented features, is replaced with a boosting classifier assisted by feature selection. The methods indicated show improved performance for 6 language pairs when applied to the output of MT systems developed over 7 years. The improved models compete better with reference-aware metrics. Notable conclusions are reached by examining the contribution of the features in the models, which makes it possible to identify common MT errors that are captured by the features. Many grammatical/fluency features contribute well, a few adequacy features contribute somewhat, whereas source complexity features are of no use. The importance of many fluency and adequacy features is language-specific.


Machine Learning-Based State-of-the-Art Methods for the Classification of RNA-Seq Data

Lecture Notes in Computational Vision and Biomechanics - Classification in BioApps ◽

10.1007/978-3-319-65981-7_6 ◽

2017 ◽

pp. 133-172 ◽

Cited By ~ 7

Author(s):

Almas Jabeen ◽

Nadeem Ahmad ◽

Khalid Raza

Keyword(s):

Machine Learning ◽

State Of The Art ◽

Rna Seq ◽

Art Methods


Predict COVID-19 Spreading With C-SMOTE

Business Information Systems ◽

10.52825/bis.v1i.45 ◽

2021 ◽

pp. 27-38

Author(s):

Alessio Bernardo ◽

Emanuele Della Valle

Keyword(s):

Machine Learning ◽

State Of The Art ◽

High Impact ◽

Statistical Evidence ◽

The Other ◽

Classification Algorithms ◽

Minority Class ◽

Art Methods ◽

Concept Drifts

Data continuously gathered by monitoring the spread of the COVID-19 pandemic form an unbounded flow of data. Accurately forecasting whether infections will increase or decrease has a high impact, but it is challenging because the pandemic spreads and contracts periodically. Technically, the flow of data is said to be imbalanced and subject to concept drifts, because signs of decrease are the minority class during spreading periods and become the majority class during contraction periods, and vice versa for signs of increase. In this paper, we propose a case study applying the Continuous Synthetic Minority Oversampling Technique (C-SMOTE), a novel meta-strategy to pipeline with Streaming Machine Learning (SML) classification algorithms, to forecast the COVID-19 pandemic trend. Benchmarking SML pipelines that use C-SMOTE against state-of-the-art methods on a COVID-19 dataset, we bring statistical evidence that models learned using C-SMOTE are better.
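For readers unfamiliar with synthetic minority oversampling, the sketch below shows only the basic interpolation step of classic SMOTE, on which the name C-SMOTE builds. It is not the C-SMOTE algorithm described in the paper and contains none of its streaming or drift-handling logic. The function name `smote_sample`, the tuple representation of samples and the parameter `k` are assumptions made for illustration.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic minority sample by interpolation (classic SMOTE step).

    minority : list of equal-length numeric tuples, all from the minority class
    k        : number of nearest minority neighbours to choose from
    rng      : optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    base = rng.choice(minority)

    # Rank the other minority samples by squared Euclidean distance to `base`.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))

    # Pick one of the k nearest neighbours and interpolate towards it.
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # uniform in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))


minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
synthetic = smote_sample(minority_points, k=2, rng=random.Random(0))
```

The synthetic point always lies on the segment between a minority sample and one of its minority neighbours, which is why oversampling in this way enlarges the minority class without copying existing samples.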


Statistical-Hypothesis-Aided Tests for Epilepsy Classification

Computers ◽

10.3390/computers8040084 ◽

2019 ◽

Vol 8 (4) ◽

pp. 84 ◽

Cited By ~ 1

Author(s):

Alaa Alqatawneh ◽

Rania Alhalaseh ◽

Ahmad Hassanat ◽

Mohammad Abbadi

Keyword(s):

State Of The Art ◽

Statistical Tests ◽

Extraction Process ◽

Classification Task ◽

Statistical Hypothesis ◽

Art Methods ◽

Epilepsy Classification ◽

The Many ◽

Electroencephalogram Eeg ◽

Similar Task

In this paper, an efficient, accurate, and nonparametric epilepsy detection and classification approach based on electroencephalogram (EEG) signals is proposed. The proposed approach mainly depends on a feature extraction process that is conducted using a set of statistical tests. Among the many existing tests, those that fit the processed data and the purpose of the proposed approach were used. From each test, various output scalars were extracted and used as features in the proposed detection and classification task. Experiments conducted on a Bonn University dataset showed that the proposed approach had very accurate results (98.4%) in the detection task and outperformed state-of-the-art methods in a similar task on the same dataset. The proposed approach also had accurate results (94.0%) in the classification task, but it did not outperform state-of-the-art methods in a similar task on the same dataset. However, the proposed approach had lower time complexity than the methods that achieved better results.


Nonuniform language in technical writing: Detection and correction

Natural Language Engineering ◽

10.1017/s1351324920000133 ◽

2020 ◽

pp. 1-22

Author(s):

Weibo Wang ◽

Aminul Islam ◽

Abidalrahman Moh’d ◽

Axel J. Soto ◽

Evangelos E. Milios

Keyword(s):

Machine Learning ◽

Technical Writing ◽

State Of The Art ◽

Classification Method ◽

Technical Document ◽

Writing Style ◽

Machine Learning Classification ◽

Text Readability ◽

Art Methods ◽

Language Detection

Abstract Technical writing in professional environments, such as user manual authoring, requires the use of uniform language. Nonuniform language refers to sentences in a technical document that are intended to have the same meaning within a similar context, but use different words or writing style. Addressing this nonuniformity problem requires the performance of two tasks. The first task, which we named nonuniform language detection (NLD), is detecting such sentences. We propose an NLD method that utilizes different similarity algorithms at lexical, syntactic, semantic and pragmatic levels. Different features are extracted and integrated by applying a machine learning classification method. The second task, which we named nonuniform language correction (NLC), is deciding which sentence among the detected ones is more appropriate for that context. To address this problem, we propose an NLC method that combines contraction removal, near-synonym choice, and text readability comparison. We tested our methods using smartphone user manuals. We finally compared our methods against state-of-the-art methods in paraphrase detection (for NLD) and against expert annotators (for both NLD and NLC). The experiments demonstrate that the proposed methods achieve performance that matches expert annotators.


A Novel Approach Based on Point Cut Set to Predict Associations of Diseases and LncRNAs

Current Bioinformatics ◽

10.2174/1574893613666181026122045 ◽

2019 ◽

Vol 14 (4) ◽

pp. 333-343 ◽

Cited By ~ 3

Author(s):

Linai Kuang ◽

Haochen Zhao ◽

Lei Wang ◽

Zhanwei Xuan ◽

Tingrui Pei

Keyword(s):

Cross Validation ◽

State Of The Art ◽

Interaction Network ◽

Research Field ◽

Computational Method ◽

Difference Matrix ◽

Art Methods ◽

Disease Associations ◽

Cut Set ◽

Fold Cross Validation

Background: In recent years, increasing evidence has indicated that long non-coding RNAs (lncRNAs) play vital roles in a wide range of human diseases and can serve as potential biomarkers and drug targets. Compared with the vast number of lncRNAs being found, the relationships between lncRNAs and diseases remain largely unknown. Objective: The prediction of novel and potential associations between lncRNAs and diseases would contribute to dissecting the complex mechanisms of disease pathogenesis. Method: In this paper, a new computational method based on Point Cut Set is proposed to predict LncRNA-Disease Associations (PCSLDA) based on known lncRNA-disease associations. Compared with the existing state-of-the-art methods, the major novelty of PCSLDA lies in the incorporation of the distance difference matrix and point cut set to set the distance correlation coefficient of nodes in the lncRNA-disease interaction network. Hence, PCSLDA can be applied to forecast potential lncRNA-disease associations while requiring only known disease-lncRNA associations. Results: Simulation results show that PCSLDA can significantly outperform previous state-of-the-art methods, with a reliable AUC of 0.8902 in leave-one-out cross-validation and AUCs of 0.7634 and 0.8317 in 5-fold and 10-fold cross-validation, respectively. Additionally, 70% of the top 10 predicted cancer-lncRNA associations can be confirmed. Conclusion: It is anticipated that our proposed model can be a great addition to the biomedical research field.


BULNER: BUg Localization with word embeddings and NEtwork Regularization

10.5753/vem.2019.7580 ◽

2019 ◽

Author(s):

Jacson Rodrigues Barbosa ◽

Ricardo Marcondes Marcacini ◽

Ricardo Britto ◽

Frederico Soares ◽

Solange Rezende ◽

...

Keyword(s):

Machine Learning ◽

Information Retrieval ◽

State Of The Art ◽

Word Embeddings ◽

Bug Localization ◽

Preliminary Results ◽

Bug Report ◽

Art Methods ◽

Software Engineers

Bug localization (BL) from the bug report is a strategic activity in the software maintenance process. Because BL is a costly and tedious activity, information retrieval-based and machine learning-based BL techniques could aid software engineers. We propose a method for BUg Localization with word embeddings and Network Regularization (BULNER). The preliminary results suggest that BULNER has better performance than two state-of-the-art methods.


Health Informatics Education in the Third World

Methods of Information in Medicine ◽

10.1055/s-0038-1636790 ◽

1989 ◽

Vol 28 (04) ◽

pp. 270-272 ◽

Cited By ~ 5

Author(s):

O. Rienhoff

Keyword(s):

Developing Countries ◽

Health Informatics ◽

Third World ◽

State Of The Art ◽

The State ◽

Collaboration Network ◽

Educational Tools ◽

The Third ◽

The Third World

Abstract: The state of the art is summarized, showing many efforts but only a few results that can serve as demonstration examples for developing countries. Education in health informatics in developing countries still mainly deals with the type of health informatics known from the industrialized world. Educational tools or curricula geared to the matter of development are rarely to be found. Some WHO activities suggest that it is time for a collaboration network to derive tools and curricula within the next decade.


Multiple vehicles detection and tracking for intelligent transport systems using machine learning approaches

Transport and Communication Science Journal ◽

10.25073/tcsj.70.3.7 ◽

2019 ◽

Vol 70 (3) ◽

pp. 214-224

Author(s):

Bui Ngoc Dung ◽

Manh Dzung Lai ◽

Tran Vu Hieu ◽

Nguyen Binh T. H.

Keyword(s):

Machine Learning ◽

Gaussian Mixture ◽

Research Field ◽

Transport Systems ◽

Learning Approaches ◽

Subtraction Method ◽

Intelligent Transport Systems ◽

Intelligent Transport ◽

Detection And Tracking ◽

Multiple Vehicles

Video surveillance is an emerging research field of intelligent transport systems. This paper presents some techniques that use machine learning and computer vision for vehicle detection and tracking. First, machine learning approaches using Haar-like features and the AdaBoost algorithm for vehicle detection are presented. Second, approaches to detect vehicles using the background subtraction method based on the Gaussian Mixture Model, and to track vehicles using optical flow and multiple Kalman filters, are presented. The method has the advantage of distinguishing and tracking multiple vehicles individually. The experimental results demonstrate the high accuracy of the method.
