Perturbation Based Learning for Structured NLP Tasks with Application to Dependency Parsing

2019 ◽  
Vol 7 ◽  
pp. 643-659
Author(s):  
Amichay Doitch ◽  
Ram Yazdi ◽  
Tamir Hazan ◽  
Roi Reichart

The best solution of structured prediction models in NLP is often inaccurate because of the limited expressive power of the model or inexact parameter estimation. One way to mitigate this problem is to sample candidate solutions from the model’s solution space, reasoning that effective exploration of this space should yield high-quality solutions. Unfortunately, sampling is often computationally hard, and many works hence back off to sub-optimal strategies, such as extracting the best-scoring solutions of the model, which are not as diverse as sampled solutions. In this paper we propose a perturbation-based approach where sampling from a probabilistic model is computationally efficient. We present a learning algorithm for the variance of the perturbations, and empirically demonstrate its importance. Moreover, while finding the argmax in our model is intractable, we propose an efficient and effective approximation. We apply our framework to cross-lingual dependency parsing across 72 corpora from 42 languages and to lightly supervised dependency parsing across 13 corpora from 12 languages, and demonstrate strong results in terms of both the quality of the entire solution list and of the final solution.
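The idea of sampling by perturbation can be illustrated with the classic Gumbel-max trick on a small, unstructured candidate set. This is only a sketch of the general technique, not the authors' algorithm: the function names, the example scores and the hand-set `scale` parameter are assumptions made for illustration, whereas the paper learns the perturbation variance and works over parse trees.

```python
import math
import random
from collections import Counter


def gumbel_noise(rng, scale=1.0):
    """Draw one Gumbel(0, scale) variate by inverse transform sampling."""
    # Clamp u away from 0 and 1 so both logarithms stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -scale * math.log(-math.log(u))


def perturb_and_argmax(scores, rng, scale=1.0):
    """Add independent Gumbel noise to every score and return the argmax index."""
    perturbed = [s + gumbel_noise(rng, scale) for s in scores]
    return max(range(len(scores)), key=perturbed.__getitem__)


def sample_candidates(scores, k, seed=0, scale=1.0):
    """Draw k candidate indices; each draw is one perturbed argmax."""
    rng = random.Random(seed)
    return [perturb_and_argmax(scores, rng, scale) for _ in range(k)]


def softmax(scores):
    """Reference distribution that Gumbel-max sampling with scale=1 follows."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


# Hypothetical scores for three candidate solutions.
scores = [2.0, 1.0, 0.0]
draws = sample_candidates(scores, k=20000, seed=0)
freqs = [Counter(draws)[i] / len(draws) for i in range(len(scores))]
```

With `scale=1.0` the empirical frequencies in `freqs` should be close to `softmax(scores)`. A very small `scale` collapses the samples onto the single best-scoring candidate, which is one way to see why the size of the perturbation matters for the diversity of the solution list.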

Author(s):  
Berni Guerrero-Calderón ◽  
Maximilian Klemp ◽  
José Alfonso Morcillo ◽  
Daniel Memmert

The aim of this study was to examine whether match physical output can be predicted from the workload applied in training by professional soccer players. Training and match load records from two professional soccer teams belonging to the Spanish First and Second Division were collected through GPS technology over a season (N = 1678 and N = 2441 records, respectively). The factors playing position, season period, quality of opposition, category and playing formation were included in the analysis. The level of significance was set at p ≤ .05. The prediction models yielded a conditional R-squared for match output of 0.51 in total distance (TD); 0.58 in high-intensity distance (HIRD, from 14 to 24 km·h−1); and 0.60 in sprint distance (SPD, >24 km·h−1). The main finding of this study was that the physical output of players in the match was predicted from the training load performed during the previous training week. Training TD negatively affected the match physical output, while training HIRD showed a positive effect. Moreover, the contextual factors (playing position, season period, division and quality of opposition) affected the players’ physical output in the match. Therefore, these results suggest that it is appropriate to program a lower training volume but a higher intensity of activity throughout the weekly microcycle, and to consider contextual factors when programming load.


2020 ◽  
Vol 36 (6) ◽  
pp. 439-442
Author(s):  
Alissa Jell ◽  
Christina Kuttler ◽  
Daniel Ostler ◽  
Norbert Hüser

Introduction: Esophageal motility disorders have a severe impact on patients’ quality of life. While high-resolution manometry (HRM) is the gold standard in the diagnosis of esophageal motility disorders, intermittently occurring muscular deficiencies often remain undiscovered if they do not lead to an intense level of discomfort or cause suffering in patients. Ambulatory long-term HRM allows us to study the circadian (dys)function of the esophagus in a unique way. With the prolonged examination period of 24 h, however, there is an immense increase in data which requires personnel and time for evaluation not available in clinical routine. Artificial intelligence (AI) might contribute here by performing an autonomous analysis. Methods: On the basis of 40 previously performed and manually tagged long-term HRM in patients with suspected temporary esophageal motility disorders, we implemented a supervised machine learning algorithm for automated swallow detection and classification. Results: For a set of 24 h of long-term HRM by means of this algorithm, the evaluation time could be reduced from 3 days to a core evaluation time of 11 min for automated swallow detection and clustering plus an additional 10–20 min of evaluation time, depending on the complexity and diversity of motility disorders in the examined patient. In 12.5% of patients with suggested esophageal motility disorders, AI-enabled long-term HRM was able to reveal new and relevant findings for subsequent therapy. Conclusion: This new approach paves the way to the clinical use of long-term HRM in patients with temporary esophageal motility disorders and might serve as an ideal and clinically relevant application of AI.


2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Thibault Asselborn ◽  
Wafa Johal ◽  
Bolat Tleubayev ◽  
Zhanel Zhexenova ◽  
Pierre Dillenbourg ◽  
...  

Do handwriting skills transfer when a child writes in two different scripts, such as the Latin and Cyrillic alphabets? Are our measures of handwriting skills intrinsically bound to one alphabet, or will a child who faces handwriting difficulties in one script experience similar difficulties in the other script? To answer these questions, 190 children from grades 1–4 were asked to copy a short text using both the Cyrillic and Latin alphabets on a digital tablet. A recent change of policy in Kazakhstan gave us an opportunity to measure transfer, as the Latin-based Kazakh alphabet has not yet been introduced. Therefore, pupils in grade 1 had 6 months of experience in Cyrillic, and pupils in grades 2, 3, and 4 had 1.5, 2.5, and 3.5 years of experience in Cyrillic, respectively. This unique situation created a quasi-experimental setting that allowed us to measure the influence of the number of years spent practicing Cyrillic on the quality of handwriting in the Latin alphabet. The results showed that some of the differences between the two scripts were constant across all grades. These differences thus reflect the intrinsic differences in the handwriting dynamics between the two alphabets. For instance, several features related to the pen pressure on the tablet are quite different. Other features, however, revealed decreasing differences between the two scripts across grades. While we found that the quality of Cyrillic writing increased from grades 1–4, due to increased practice, we also found that the quality of the Latin writing increased as well, despite the fact that all of the pupils had the same absence of experience in writing in Latin. We can therefore interpret this improvement in Latin script as an indicator of the transfer of fine motor control skills from Cyrillic to Latin. This result is especially surprising given that one could instead hypothesize a negative transfer, i.e., that the finger controls automated for one alphabet would interfere with those required by the other alphabet. One interesting side effect of these findings is that the algorithms that we developed for the diagnosis of handwriting difficulties among French-speaking children could be relevant for other alphabets, paving the way for the creation of a cross-lingual model for the detection of handwriting difficulties.


Author(s):  
Jie Yuan ◽  
Yuan Ji ◽  
Zhou Zhu ◽  
Liya Huang ◽  
Junfeng Qian ◽  
...  

To address the large error and low performance of traditional methods for checking progressive image model matching information, an automatic checking method based on machine learning is proposed. The generation method of progressive images is analyzed, and the target image sample is obtained. On this basis, a machine learning algorithm is used to segment the progressive image samples. In each segmented part of the image, crawler technology is used to automatically collect progressive image model matching information, and, under the constraint of the image model matching information checking standard, the matching information is checked automatically with respect to geometric structure, image content and other aspects. Experimental results show that the verification error of the designed method is reduced by 0.687 Mb, and the quality of the progressive image is improved.


2021 ◽  
pp. 1-38
Author(s):  
Hailie Suk ◽  
Ayushi Sharma ◽  
Anand Balu Nellippallil ◽  
Ashok Das ◽  
John Hall

The quality of life (QOL) in rural communities is improved through electrification. Microgrids can provide electricity in areas where grid access to electricity is infeasible. Still, insufficient power capacity hinders the very progress that microgrids promote. Therefore, we propose a decision-making framework to manage power distribution based on its impact on the rural QOL. Parameters are examined in this paper to represent the QOL pertaining to water, safety, education, and leisure/social activities. Each parameter is evaluated based on condition, community importance, and energy dependence. A solution for power allocation is developed by executing the compromise decision support problem (cDSP) and exploring the solution space. Energy loads, such as those required for powering water pumps, streetlamps, and household devices are prioritized in the context of the QOL. The technique also allows decision-makers to update the power distribution scheme as the dynamics between energy production and demand change over time. In this paper, we propose a framework for connecting QOL and power management. The flexibility of the approach is demonstrated using a problem with varying scenarios that may be time-dependent. The work enables sustainable energy solutions that can evolve with community development.


2021 ◽  
Vol 12 (5) ◽  
pp. 1-21
Author(s):  
Changsen Yuan ◽  
Heyan Huang ◽  
Chong Feng

The Graph Convolutional Network (GCN) is a universal relation extraction method that can predict relations of entity pairs by capturing sentences’ syntactic features. However, existing GCN methods often use dependency parsing to generate graph matrices and learn syntactic features. The quality of the dependency parsing directly affects the accuracy of the graph matrix and, in turn, the performance of the whole GCN. Because of the influence of noisy words and sentence length in distantly supervised datasets, applying dependency parsing to sentences causes errors and leads to unreliable information. Therefore, it is difficult to obtain credible graph matrices and relational features for some special sentences. In this article, we present a Multi-Graph Cooperative Learning model (MGCL), which focuses on extracting the reliable syntactic features of relations from different graphs and harnessing them to improve the representations of sentences. We conduct experiments on a widely used real-world dataset, and the experimental results show that our model achieves state-of-the-art performance on relation extraction.
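To make the role of the graph matrix concrete, the following sketch builds an adjacency matrix from a toy dependency parse and applies one graph-convolution layer. It is an illustration under stated assumptions rather than the MGCL model: the three-word sentence, the undirected edges, the identity features and weights, and the symmetric normalization D^{-1/2}(A + I)D^{-1/2} (the commonly used GCN form) are all chosen here for the example, and the abstract does not say which variant the authors use.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so every token keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, features, weights):
    """One graph-convolution layer: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalized_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


# Toy parse of "She eats apples": "eats" is the head of both other tokens.
# Token order: She, eats, apples. Edges are treated as undirected (an assumption).
adj = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

# With identity features and weights the output is the normalized graph matrix itself.
out = gcn_layer(adj, identity, identity)
```

Because every token's output row is a weighted mix of its parse neighbours, a wrong dependency edge changes which tokens are mixed together, which is the sensitivity to parsing quality that the abstract describes.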


2021 ◽  
pp. postgradmedj-2020-139352
Author(s):  
Simon Allan ◽  
Raphael Olaiya ◽  
Rasan Burhan

Cardiovascular disease (CVD) is one of the leading causes of death across the world. CVD can lead to angina, heart attacks, heart failure, strokes, and many other serious conditions, and eventually to death. Early intervention for those at a higher risk of developing CVD, typically with statin treatment, leads to better health outcomes. For this reason, clinical prediction models (CPMs) have been developed to identify those at a high risk of developing CVD so that treatment can begin at an earlier stage. Currently, CPMs are built around statistical analysis of factors linked to developing CVD, such as body mass index and family history. The emerging field of machine learning (ML) in healthcare, using computer algorithms that learn from a dataset without explicit programming, has the potential to outperform the CPMs available today. ML has already shown exciting progress in the detection of skin malignancies, bone fractures and many other medical conditions. In this review, we analyse and explain the CPMs currently in use and compare them with their developing ML counterparts. We have found that although the newest non-ML CPMs are effective, ML-based approaches consistently outperform them. However, improvements to the literature need to be made before ML should be implemented over current CPMs.


2018 ◽  
Vol 8 (12) ◽  
pp. 2416 ◽  
Author(s):  
Ansi Zhang ◽  
Honglei Wang ◽  
Shaobo Li ◽  
Yuxin Cui ◽  
Zhonghao Liu ◽  
...  

Prognostics, such as remaining useful life (RUL) prediction, is a crucial task in condition-based maintenance. A major challenge in data-driven prognostics is the difficulty of obtaining a sufficient number of samples of failure progression. However, for traditional machine learning methods and deep neural networks, sufficient training data is a prerequisite for training good prediction models. In this work, we propose a transfer learning algorithm based on Bi-directional Long Short-Term Memory (BLSTM) recurrent neural networks for RUL estimation, in which the models are first trained on different but related datasets and then fine-tuned on the target dataset. Extensive experimental results show that transfer learning can in general improve the prediction models on datasets with a small number of samples. The one exception is that, when transferring from multi-type operating conditions to a single operating condition, transfer learning led to a worse result.

