answer extraction
Recently Published Documents


TOTAL DOCUMENTS

69
(FIVE YEARS 17)

H-INDEX

8
(FIVE YEARS 3)

2022 ◽  
Vol 40 (4) ◽  
pp. 1-24
Author(s):  
Yongqi Li ◽  
Wenjie Li ◽  
Liqiang Nie

In recent years, conversational agents have provided natural and convenient access to useful information in people’s daily lives, and have opened up a broad new research topic, conversational question answering (QA). Building on conversational QA, we study the conversational open-domain QA problem, where users’ information needs are expressed in a conversation and exact answers must be extracted from the Web. Despite its significance and value, building an effective conversational open-domain QA system is non-trivial because of the following challenges: (1) precisely understanding conversational questions based on the conversation context; (2) extracting exact answers by capturing the answer dependency and transition flow in a conversation; and (3) deeply integrating question understanding and answer extraction. To address these challenges, we propose an end-to-end Dynamic Graph Reasoning approach to Conversational open-domain QA (DGRCoQA for short). DGRCoQA comprises three components: a dynamic question interpreter (DQI), a graph reasoning enhanced retriever (GRR), and a typical Reader. The first is developed to understand and formulate conversational questions, while the other two are responsible for extracting an exact answer from the Web. In particular, DQI understands conversational questions by using the QA context, drawn from predicted answers returned by the Reader, to dynamically attend to the most relevant information in the conversation context. GRR then attempts to capture the answer flow and select the passage most likely to contain the answer by reasoning over answer paths in a dynamically constructed context graph. Finally, the Reader, a reading comprehension model, predicts a text span from the selected passage as the answer. In extensive experiments on a benchmark dataset, DGRCoQA significantly outperforms existing methods and achieves state-of-the-art performance.
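
As an illustration of the final step, where a Reader predicts a text span from the selected passage, the following is a minimal sketch of span selection from per-token start and end scores. It assumes the common reading-comprehension setup in which a model emits one start score and one end score per token; the function name best_span, the max_len cap and the toy scores are hypothetical and are not taken from the DGRCoQA paper.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 10) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring span with start <= end.

    The span score is the sum of the start score and the end score, and
    spans longer than max_len tokens are not considered (an assumed cap).
    """
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and equal in length")
    best = (0, 0, float("-inf"))
    for i, s in enumerate(start_scores):
        # Only consider end positions within max_len tokens of the start.
        for j in range(i, min(i + max_len, len(end_scores))):
            score = s + end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Toy passage and made-up scores, for illustration only.
tokens = ["The", "Eiffel", "Tower", "is", "in", "Paris", "."]
start = [0.1, 0.2, 0.0, 0.1, 0.3, 2.5, 0.0]
end = [0.0, 0.1, 0.4, 0.0, 0.1, 2.0, 0.2]

i, j, score = best_span(start, end)
answer = " ".join(tokens[i:j + 1])
print(answer)
```

With these toy scores the selected span is the single token at index 5. A real Reader would obtain the scores from a trained model rather than from hand-written lists.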


2022 ◽  
Vol 24 (3) ◽  
pp. 1-16
Author(s):  
Manvi Breja ◽  
Sanjay Kumar Jain

Why-type non-factoid questions are ambiguous, and their answers vary. Returning one appropriate answer to the user requires answer extraction, re-ranking and validation. In some cases the system must understand the meaning and context of a document rather than find the exact words used in the question. The paper addresses this problem by exploring lexico-syntactic, semantic and contextual query-dependent features, some of which are based on deep learning frameworks, to estimate the probability that an answer candidate is relevant to the question. The features are weighted by the feature importance scores returned by an ensemble ExtraTreesClassifier. An answer re-ranker model is implemented that selects the highest-ranked answer, the one with the largest feature similarity between question and answer candidate, achieving a Mean Reciprocal Rank (MRR) of 0.64. Finally, the answer is validated by matching the answer type of each answer candidate, and the highest-ranked candidate with a matching answer type is returned to the user.
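
The re-ranking and evaluation described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: it assumes each answer candidate already has a vector of question-answer similarity features and that per-feature weights are given. In the paper the weights come from ExtraTreesClassifier feature importances; here they are made-up numbers, and the names rerank and mean_reciprocal_rank are hypothetical.

```python
from typing import Dict, List, Optional


def rerank(candidates: Dict[str, List[float]], weights: List[float]) -> List[str]:
    """Order candidate ids by their weighted feature sum, highest first."""
    def score(features: List[float]) -> float:
        return sum(w * f for w, f in zip(weights, features))
    return sorted(candidates, key=lambda cid: score(candidates[cid]), reverse=True)


def mean_reciprocal_rank(first_correct_ranks: List[Optional[int]]) -> float:
    """MRR over questions.

    Each entry is the 1-based rank of the first correct answer for one
    question, or None when no correct answer was returned (contributes 0).
    """
    if not first_correct_ranks:
        raise ValueError("need at least one question")
    total = sum(1.0 / r for r in first_correct_ranks if r is not None)
    return total / len(first_correct_ranks)


# Hypothetical weights and feature values, for illustration only.
weights = [0.5, 0.3, 0.2]
candidates = {
    "a1": [0.2, 0.9, 0.1],
    "a2": [0.8, 0.4, 0.5],
    "a3": [0.5, 0.5, 0.5],
}
ranking = rerank(candidates, weights)
mrr = mean_reciprocal_rank([1, 2, None, 4])
print(ranking, mrr)
```

MRR rewards systems that place a correct answer near the top of the list: a correct answer at rank 1 contributes 1, at rank 2 contributes 1/2, and a question with no correct answer contributes 0.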


Entropy ◽  
2021 ◽  
Vol 23 (3) ◽  
pp. 322
Author(s):  
Junjie Zeng ◽  
Xiaoya Sun ◽  
Qi Zhang ◽  
Xinmeng Li

Machine Reading Comprehension (MRC) research concerns how to endow machines with the ability to understand given passages and answer questions, a challenging problem in natural language processing. To solve the Chinese MRC task efficiently, this paper proposes an Improved Extraction-based Reading Comprehension method with Answer Re-ranking (IERC-AR), consisting of a candidate answer extraction module and a re-ranking module. The candidate answer extraction module uses an improved pre-trained language model, RoBERTa-WWM, to generate precise word representations, which helps resolve polysemy and capture Chinese word-level features. The re-ranking module re-evaluates candidate answers with a self-attention mechanism, which can improve the accuracy of the predicted answers. Traditional machine-reading methods generally integrate different modules into a pipeline system, which leads to re-encoding problems and inconsistent data distributions between the training and testing phases; therefore, this paper proposes an end-to-end model architecture for IERC-AR that integrates the candidate answer extraction and re-ranking modules. The experimental results on the Les MMRC dataset show that IERC-AR outperforms state-of-the-art MRC approaches.


2020 ◽  
Vol 29 (06) ◽  
pp. 2050019
Author(s):  
Hadi Veisi ◽  
Hamed Fakour Shandi

A question answering system is a type of information retrieval system that takes a natural-language question from a user as input and returns the best answer to it as output. In this paper, a medical question answering system for the Persian language is designed and implemented. As part of this research, a dataset of diseases and drugs is collected and structured. The proposed system includes three main modules: question processing, document retrieval, and answer extraction. For the question processing module, a sequential architecture is designed that retrieves the main concept of a question using several components, which rely on rule-based methods, natural language processing, and dictionary-based techniques. In the document retrieval module, the documents are indexed and searched using the Lucene library. The retrieved documents are ranked using similarity detection algorithms, and the highest-ranked document is passed to the answer extraction module. This module is responsible for extracting the most relevant section of the text in the retrieved document. Customized language processing tools for Persian, such as a part-of-speech tagger and a lemmatizer, are also developed. Evaluation results show that the system performs well in answering different questions about diseases and drugs. The accuracy of the system on 500 sample questions is 83.6%.
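
The three-module structure described above (question processing, document retrieval, answer extraction) can be sketched in a few lines. This is a toy illustration under loud assumptions: it uses English text and simple term overlap in place of the paper's Persian tools, Lucene index and similarity algorithms, and every name and document in it (key_terms, retrieve, extract_answer, STOPWORDS, docs) is hypothetical.

```python
import re
from typing import Dict, Set

# Tiny assumed stopword list; the paper uses rule-based and dictionary-based
# components for Persian instead.
STOPWORDS = {"what", "is", "the", "a", "an", "of", "for", "are", "to", "in", "and"}


def key_terms(text: str) -> Set[str]:
    """Question processing stand-in: lowercase, tokenize, drop stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def retrieve(question_terms: Set[str], documents: Dict[str, str]) -> str:
    """Document retrieval stand-in: id of the document with the most shared terms."""
    return max(documents, key=lambda d: len(question_terms & key_terms(documents[d])))


def extract_answer(question_terms: Set[str], document: str) -> str:
    """Answer extraction stand-in: the sentence sharing the most terms with the question."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return max(sentences, key=lambda s: len(question_terms & key_terms(s)))


# Made-up documents, for illustration only.
docs = {
    "influenza": "Influenza is a viral infection. "
                 "Common symptoms of influenza include fever and cough.",
    "aspirin": "Aspirin is a drug used to reduce pain. "
               "Side effects of aspirin include stomach upset.",
}
question = "What are the symptoms of influenza?"
terms = key_terms(question)
doc_id = retrieve(terms, docs)
answer = extract_answer(terms, docs[doc_id])
print(doc_id, "->", answer)
```

Keeping the stages as separate functions mirrors the modular design in the abstract: each stand-in could be replaced independently, for example by swapping the overlap-based retrieve for a real search index.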


2020 ◽  
Vol 24 (2) ◽  
Author(s):  
Abdullah Faiz Ur Rahman Khilji ◽  
Riyanka Manna ◽  
Sahinur Rahman Laskar ◽  
Partha Pakray ◽  
Dipankar Das ◽  
...  

2020 ◽  
Vol 2020 ◽  
pp. 1-10 ◽  
Author(s):  
Atif Khan ◽  
Ibrahim Ibrahim ◽  
M. Irfan Uddin ◽  
Muhammad Zubair ◽  
Shafiq Ahmad ◽  
...  

Data are flooding into online web forums, and it is highly desirable to turn this large volume of data into actionable knowledge. Online web forums have become an integral part of the web and are major sources of knowledge. People use these platforms to post questions and get answers from other forum members. Usually, an initial post (question) receives more than one reply post (answer), which makes it difficult for a user to scan all of them for the most relevant, highest-quality answer. Thus, automatically extracting the most relevant answer to a question within a thread is an important problem. In this research, we treat answer extraction as a classification problem. A reply post can be classified as relevant, partially relevant, or irrelevant to the initial post. To measure the relevance/similarity of a reply to the question, both lexical and nonlexical features are used. We propose using LinearSVC, a variant of the support vector machine (SVM), for answer classification. Two feature selection techniques, chi-square and univariate, are employed to reduce the size of the feature space. The experimental results show that the LinearSVC classifier outperforms the other state-of-the-art classifiers in classification accuracy on both the Ubuntu and TripAdvisor (NYC) discussion forum datasets.
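
To make the chi-square feature selection step concrete, here is a minimal sketch using the classic 2x2 contingency form of the statistic for a binary feature and a binary class. The abstract does not say how the statistic was computed, so this form, the function names chi_square and top_k_features, and all counts below are assumptions for illustration; the paper also uses three relevance classes rather than two.

```python
from typing import Dict, List, Tuple


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for a 2x2 contingency table.

    a: relevant replies containing the feature
    b: non-relevant replies containing the feature
    c: relevant replies without the feature
    d: non-relevant replies without the feature
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty, so the feature carries no information.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def top_k_features(counts: Dict[str, Tuple[int, int, int, int]], k: int) -> List[str]:
    """Keep the k features with the highest chi-square score."""
    return sorted(counts, key=lambda f: chi_square(*counts[f]), reverse=True)[:k]


# Hypothetical counts over 100 reply posts.
counts = {
    "thanks": (25, 25, 25, 25),    # independent of relevance, score 0
    "solved": (30, 10, 20, 40),    # associated with relevant replies
    "sudo": (28, 22, 22, 28),      # weakly associated
}
selected = top_k_features(counts, k=2)
print(selected)
```

A feature that is spread evenly across classes scores 0 and is dropped first, which is how this kind of selection shrinks the feature space before a classifier such as LinearSVC is trained.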

