scholarly journals Biomedical document triage using a hierarchical attention-based capsule network

2020 ◽  
Vol 21 (S13) ◽  
Author(s):  
Jian Wang ◽  
Mengying Li ◽  
Qishuai Diao ◽  
Hongfei Lin ◽  
Zhihao Yang ◽  
...  

Abstract Background Biomedical document triage is the foundation of biomedical information extraction, which is important to precision medicine. Recently, some neural networks-based methods have been proposed to classify biomedical documents automatically. In the biomedical domain, documents are often very long and often contain very complicated sentences. However, the current methods still find it difficult to capture important features across sentences. Results In this paper, we propose a hierarchical attention-based capsule model for biomedical document triage. The proposed model effectively employs hierarchical attention mechanism and capsule networks to capture valuable features across sentences and construct a final latent feature representation for a document. We evaluated our model on three public corpora. Conclusions Experimental results showed that both hierarchical attention mechanism and capsule networks are helpful in biomedical document triage task. Our method proved itself highly competitive or superior compared with other state-of-the-art methods.

2020 ◽  
Vol 29 (15) ◽  
pp. 2050250
Author(s):  
Xiongfei Liu ◽  
Bengao Li ◽  
Xin Chen ◽  
Haiyan Zhang ◽  
Shu Zhan

This paper proposes a novel method for person image generation with arbitrary target pose. Given a person image and an arbitrary target pose, our proposed model can synthesize images with the same person but different poses. The Generative Adversarial Networks (GANs) are the major part of the proposed model. Different from the traditional GANs, we add attention mechanism to the generator in order to generate realistic-looking images, we also use content reconstruction with a pretrained VGG16 Net to keep the content consistency between generated images and target images. Furthermore, we test our model on DeepFashion and Market-1501 datasets. The experimental results show that the proposed network performs favorably against state-of-the-art methods.


2020 ◽  
pp. 1-21 ◽  
Author(s):  
Clément Dalloux ◽  
Vincent Claveau ◽  
Natalia Grabar ◽  
Lucas Emanuel Silva Oliveira ◽  
Claudia Maria Cabral Moro ◽  
...  

Abstract Automatic detection of negated content is often a prerequisite in information extraction systems in various domains. In the biomedical domain especially, this task is important because negation plays an important role. In this work, two main contributions are proposed. First, we work with languages which have been poorly addressed up to now: Brazilian Portuguese and French. Thus, we developed new corpora for these two languages which have been manually annotated for marking up the negation cues and their scope. Second, we propose automatic methods based on supervised machine learning approaches for the automatic detection of negation marks and of their scopes. The methods show to be robust in both languages (Brazilian Portuguese and French) and in cross-domain (general and biomedical languages) contexts. The approach is also validated on English data from the state of the art: it yields very good results and outperforms other existing approaches. Besides, the application is accessible and usable online. We assume that, through these issues (new annotated corpora, application accessible online, and cross-domain robustness), the reproducibility of the results and the robustness of the NLP applications will be augmented.


2021 ◽  
Vol 11 (23) ◽  
pp. 11344
Author(s):  
Wei Ke ◽  
Ka-Hou Chan

Paragraph-based datasets are hard to analyze by a simple RNN, because a long sequence always contains lengthy problems of long-term dependencies. In this work, we propose a Multilayer Content-Adaptive Recurrent Unit (CARU) network for paragraph information extraction. In addition, we present a type of CNN-based model as an extractor to explore and capture useful features in the hidden state, which represent the content of the entire paragraph. In particular, we introduce the Chebyshev pooling to connect to the end of the CNN-based extractor instead of using the maximum pooling. This can project the features into a probability distribution so as to provide an interpretable evaluation for the final analysis. Experimental results demonstrate the superiority of the proposed approach, being compared to the state-of-the-art models.


2019 ◽  
Vol 9 (18) ◽  
pp. 3908 ◽  
Author(s):  
Jintae Kim ◽  
Shinhyeok Oh ◽  
Oh-Woog Kwon ◽  
Harksoo Kim

To generate proper responses to user queries, multi-turn chatbot models should selectively consider dialogue histories. However, previous chatbot models have simply concatenated or averaged vector representations of all previous utterances without considering contextual importance. To mitigate this problem, we propose a multi-turn chatbot model in which previous utterances participate in response generation using different weights. The proposed model calculates the contextual importance of previous utterances by using an attention mechanism. In addition, we propose a training method that uses two types of Wasserstein generative adversarial networks to improve the quality of responses. In experiments with the DailyDialog dataset, the proposed model outperformed the previous state-of-the-art models based on various performance measures.


2019 ◽  
Vol 29 (11n12) ◽  
pp. 1727-1740 ◽  
Author(s):  
Hongming Zhu ◽  
Yi Luo ◽  
Qin Liu ◽  
Hongfei Fan ◽  
Tianyou Song ◽  
...  

Multistep flow prediction is an essential task for the car-sharing systems. An accurate flow prediction model can help system operators to pre-allocate the cars to meet the demand of users. However, this task is challenging due to the complex spatial and temporal relations among stations. Existing works only considered temporal relations (e.g. using LSTM) or spatial relations (e.g. using CNN) independently. In this paper, we propose an attention to multi-graph convolutional sequence-to-sequence model (AMGC-Seq2Seq), which is a novel deep learning model for multistep flow prediction. The proposed model uses the encoder–decoder architecture, wherein the encoder part, spatial and temporal relations are encoded simultaneously. Then the encoded information is passed to the decoder to generate multistep outputs. In this work, specific multiple graphs are constructed to reflect spatial relations from different aspects, and we model them by using the proposed multi-graph convolution. Attention mechanism is also used to capture the important relations from previous information. Experiments on a large-scale real-world car-sharing dataset demonstrate the effectiveness of our approach over state-of-the-art methods.


2020 ◽  
Vol 34 (05) ◽  
pp. 9612-9619
Author(s):  
Zhao Zhang ◽  
Fuzhen Zhuang ◽  
Hengshu Zhu ◽  
Zhiping Shi ◽  
Hui Xiong ◽  
...  

The rapid proliferation of knowledge graphs (KGs) has changed the paradigm for various AI-related applications. Despite their large sizes, modern KGs are far from complete and comprehensive. This has motivated the research in knowledge graph completion (KGC), which aims to infer missing values in incomplete knowledge triples. However, most existing KGC models treat the triples in KGs independently without leveraging the inherent and valuable information from the local neighborhood surrounding an entity. To this end, we propose a Relational Graph neural network with Hierarchical ATtention (RGHAT) for the KGC task. The proposed model is equipped with a two-level attention mechanism: (i) the first level is the relation-level attention, which is inspired by the intuition that different relations have different weights for indicating an entity; (ii) the second level is the entity-level attention, which enables our model to highlight the importance of different neighboring entities under the same relation. The hierarchical attention mechanism makes our model more effective to utilize the neighborhood information of an entity. Finally, we extensively validate the superiority of RGHAT against various state-of-the-art baselines.


Algorithms ◽  
2020 ◽  
Vol 13 (12) ◽  
pp. 342
Author(s):  
Guojing Huang ◽  
Qingliang Chen ◽  
Congjian Deng

With the development of E-commerce, online advertising began to thrive and has gradually developed into a new mode of business, of which Click-Through Rates (CTR) prediction is the essential driving technology. Given a user, commodities and scenarios, the CTR model can predict the user’s click probability of an online advertisement. Recently, great progress has been made with the introduction of Deep Neural Networks (DNN) into CTR. In order to further advance the DNN-based CTR prediction models, this paper introduces a new model of FO-FTRL-DCN, based on the prestigious model of Deep&Cross Network (DCN) augmented with the latest optimization technique of Follow The Regularized Leader (FTRL) for DNN. The extensive comparative experiments on the iPinYou datasets show that the proposed model has outperformed other state-of-the-art baselines, with better generalization across different datasets in the benchmark.


2020 ◽  
Vol 34 (05) ◽  
pp. 9749-9756
Author(s):  
Junnan Zhu ◽  
Yu Zhou ◽  
Jiajun Zhang ◽  
Haoran Li ◽  
Chengqing Zong ◽  
...  

Multimodal summarization with multimodal output (MSMO) is to generate a multimodal summary for a multimodal news report, which has been proven to effectively improve users' satisfaction. The existing MSMO methods are trained by the target of text modality, leading to the modality-bias problem that ignores the quality of model-selected image during training. To alleviate this problem, we propose a multimodal objective function with the guidance of multimodal reference to use the loss from the summary generation and the image selection. Due to the lack of multimodal reference data, we present two strategies, i.e., ROUGE-ranking and Order-ranking, to construct the multimodal reference by extending the text reference. Meanwhile, to better evaluate multimodal outputs, we propose a novel evaluation metric based on joint multimodal representation, projecting the model output and multimodal reference into a joint semantic space during evaluation. Experimental results have shown that our proposed model achieves the new state-of-the-art on both automatic and manual evaluation metrics. Besides, our proposed evaluation method can effectively improve the correlation with human judgments.


Author(s):  
Elaheh Barati ◽  
Xuewen Chen

In reinforcement learning algorithms, leveraging multiple views of the environment can improve the learning of complicated policies. In multi-view environments, due to the fact that the views may frequently suffer from partial observability, their level of importance are often different. In this paper, we propose a deep reinforcement learning method and an attention mechanism in a multi-view environment. Each view can provide various representative information about the environment. Through our attention mechanism, our method generates a single feature representation of environment given its multiple views. It learns a policy to dynamically attend to each view based on its importance in the decision-making process. Through experiments, we show that our method outperforms its state-of-the-art baselines on TORCS racing car simulator and three other complex 3D environments with obstacles. We also provide experimental results to evaluate the performance of our method on noisy conditions and partial observation settings.


2019 ◽  
Vol 9 (7) ◽  
pp. 1330 ◽  
Author(s):  
Yalong Jiang ◽  
Zheru Chi

Although a state-of-the-art performance has been achieved in pixel-specific tasks, such as saliency prediction and depth estimation, convolutional neural networks (CNNs) still perform unsatisfactorily in human parsing where semantic information of detailed regions needs to be perceived under the influences of variations in viewpoints, poses, and occlusions. In this paper, we propose to improve the robustness of human parsing modules by introducing a depth-estimation module. A novel scheme is proposed for the integration of a depth-estimation module and a human-parsing module. The robustness of the overall model is improved with the automatically obtained depth labels. As another major concern, the computational efficiency is also discussed. Our proposed human parsing module with 24 layers can achieve a similar performance as the baseline CNN model with over 100 layers. The number of parameters in the overall model is less than that in the baseline model. Furthermore, we propose to reduce the computational burden by replacing a conventional CNN layer with a stack of simplified sub-layers to further reduce the overall number of trainable parameters. Experimental results show that the integration of two modules contributes to the improvement of human parsing without additional human labeling. The proposed model outperforms the benchmark solutions and the capacity of our model is better matched to the complexity of the task.


Sign in / Sign up

Export Citation Format

Share Document