Question-aware memory network for multi-hop question answering in human–robot interaction

Author(s):  
Xinmeng Li ◽  
Mamoun Alazab ◽  
Qian Li ◽  
Keping Yu ◽  
Quanjun Yin

Abstract
Knowledge graph question answering is an important technology in intelligent human–robot interaction; it aims to automatically answer human natural language questions over a given knowledge graph. For multi-relation questions, which show higher variety and complexity, the tokens of the question carry different priorities for triple selection across the reasoning steps. Most existing models treat the question as a whole and ignore this priority information. To solve this problem, we propose a question-aware memory network for multi-hop question answering, named QA2MN, which dynamically updates the attention over the question during the reasoning process. In addition, we incorporate graph context information into a knowledge graph embedding model to strengthen its ability to represent entities and relations; we use the resulting embeddings to initialize QA2MN and fine-tune them during training. We evaluate QA2MN on PathQuestion and WorldCup2014, two representative datasets for complex multi-hop question answering. The results demonstrate that QA2MN achieves state-of-the-art Hits@1 accuracy on both datasets, which validates the effectiveness of our model.
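The following is a minimal PyTorch sketch of the mechanism the abstract describes: one reasoning hop that first re-weights the question tokens conditioned on the current reasoning state, then attends over candidate triples. All module names, tensor shapes, and the state-update rule are illustrative assumptions, not the paper's exact architecture.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class QuestionAwareHop(nn.Module):
        """One hop of question-aware reasoning (illustrative sketch)."""
        def __init__(self, dim):
            super().__init__()
            self.q_proj = nn.Linear(dim, dim)  # scores question tokens against the state
            self.m_proj = nn.Linear(dim, dim)  # scores memory (triple) embeddings

        def forward(self, q_tokens, memory, state):
            # q_tokens: (batch, q_len, dim)     question token embeddings
            # memory:   (batch, n_triples, dim) candidate triple embeddings
            # state:    (batch, dim)            current reasoning state
            # 1) Refresh the attention over the question, conditioned on the
            #    state, so different tokens get priority at different hops.
            q_scores = torch.bmm(self.q_proj(q_tokens), state.unsqueeze(2)).squeeze(2)
            q_attn = F.softmax(q_scores, dim=1)                    # (batch, q_len)
            q_vec = torch.bmm(q_attn.unsqueeze(1), q_tokens).squeeze(1)
            # 2) Select triples using the refreshed question view.
            m_scores = torch.bmm(self.m_proj(memory), q_vec.unsqueeze(2)).squeeze(2)
            m_attn = F.softmax(m_scores, dim=1)                    # (batch, n_triples)
            m_vec = torch.bmm(m_attn.unsqueeze(1), memory).squeeze(1)
            # 3) Fold the retrieved evidence back into the reasoning state.
            return state + m_vec, q_attn, m_attn

Stacking several such hops and reading the answer off the final triple attention would give a rough analogue of multi-hop reasoning; the KG-embedding initialization mentioned in the abstract would supply the initial memory vectors.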

2019 ◽  
Vol 8 (10) ◽  
pp. 428 ◽  
Author(s):  
Bingchuan Jiang ◽  
Liheng Tan ◽  
Yan Ren ◽  
Feng Li

The core of intelligent virtual geographical environments (VGEs) is the formal expression of geographic knowledge. Its purpose is to transform the data, information, and scenes of a virtual geographic environment into “knowledge” that computers can recognize, so that they can understand the virtual geographic environment more easily. A geographic knowledge graph (GeoKG) is a large-scale semantic web that stores geographical knowledge in a structured form. Based on a geographic knowledge base and a geospatial database, intelligent interaction with virtual geographical environments can be realized through natural language question answering, entity linking, and so on. In this paper, a knowledge-enhanced VGE service framework is proposed. We construct a multi-level semantic parsing model and an enhanced GeoKG for structured geographic information data, such as digital maps and 3D virtual scenes, as well as for unstructured information data. Based on the GeoKG, we propose a bidirectional LSTM-CRF (long short-term memory–conditional random field) model to achieve natural language question answering for VGEs and conduct experiments on the method. The results prove that intelligent interaction based on the knowledge graph can bridge the gap between people and virtual environments.
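To make the tagging step concrete, here is a minimal BiLSTM-CRF sketch for labelling entity and relation mentions in a geographic question. It assumes the third-party pytorch-crf package; the vocabulary, tag set, and layer sizes are placeholders, and the authors' actual semantic parsing model is certainly more elaborate.

    import torch
    import torch.nn as nn
    from torchcrf import CRF  # pip install pytorch-crf

    class BiLSTMCRF(nn.Module):
        """Sequence tagger: BiLSTM emissions + CRF transitions (sketch)."""
        def __init__(self, vocab_size, num_tags, embed_dim=100, hidden_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim // 2,
                                bidirectional=True, batch_first=True)
            self.emit = nn.Linear(hidden_dim, num_tags)  # per-token tag scores
            self.crf = CRF(num_tags, batch_first=True)   # learns tag transitions

        def loss(self, tokens, tags, mask):
            # tokens, tags: (batch, seq_len); mask: (batch, seq_len) bool
            emissions = self.emit(self.lstm(self.embed(tokens))[0])
            return -self.crf(emissions, tags, mask=mask)  # negative log-likelihood

        def predict(self, tokens, mask):
            emissions = self.emit(self.lstm(self.embed(tokens))[0])
            return self.crf.decode(emissions, mask=mask)  # best tag sequences

The decoded tag sequences (e.g. place names and relation words) would then be linked against the GeoKG to retrieve an answer; that linking step is outside this sketch.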


2020 ◽  
Vol 12 (1) ◽  
pp. 58-73
Author(s):  
Sofia Thunberg ◽  
Tom Ziemke

Abstract
Interaction between humans and robots will benefit if people have at least a rough mental model of what a robot knows about the world and what it plans to do. But how do we design human–robot interactions to facilitate this? Previous research has shown that one can change people's mental models of robots by manipulating the robots' physical appearance. However, this has mostly not been done in a user-centred way, i.e. without a focus on what users need and want. Starting from theories of how humans form and adapt mental models of others, we investigated how the participatory design method PICTIVE can be used to generate design ideas about how a humanoid robot could communicate. Five participants went through three phases based on eight scenarios from the state-of-the-art tasks in the RoboCup@Home social robotics competition. The results indicate that participatory design can be a suitable method to generate design concepts for robots' communication in human–robot interaction.


Author(s):  
Lianli Gao ◽  
Pengpeng Zeng ◽  
Jingkuan Song ◽  
Yuan-Fang Li ◽  
Wu Liu ◽  
...  

To date, visual question answering (VQA) (i.e., image QA and video QA) remains a holy grail in vision and language understanding, especially video QA. Compared with image QA, which focuses primarily on understanding the associations between image region-level details and the corresponding questions, video QA requires a model to jointly reason across both the spatial and long-range temporal structures of a video as well as the text to provide an accurate answer. In this paper, we specifically tackle video QA by proposing a Structured Two-stream Attention network, namely STA, to answer a free-form or open-ended natural language question about the content of a given video. First, we infer rich long-range temporal structures in videos using our structured segment component and encode text features. Then, our structured two-stream attention component simultaneously localizes important visual instances, reduces the influence of background video content, and focuses on the relevant text. Finally, the structured two-stream fusion component incorporates different segments of the query- and video-aware context representation and infers the answer. Experiments on the large-scale video QA dataset TGIF-QA show that our proposed method significantly surpasses the best counterpart (i.e., with one representation for the video input) by 13.0%, 13.5%, and 11.0% on the Action, Trans., and FrameQA tasks, and by 0.3 on the Count task. It also outperforms the best competitor (i.e., with two representations) on the Action, Trans., and FrameQA tasks by 4.1%, 4.7%, and 5.1%.
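As a rough illustration of the two-stream attention idea, the sketch below attends a question vector over two visual streams and fuses the attended results. The stream names (appearance and motion), the dimensions, and the fusion rule are assumptions for illustration; STA's structured segment and fusion components are more involved than this.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TwoStreamAttention(nn.Module):
        """Question-guided attention over two video streams (sketch)."""
        def __init__(self, dim):
            super().__init__()
            self.app_proj = nn.Linear(dim, dim)  # appearance-stream scoring
            self.mot_proj = nn.Linear(dim, dim)  # motion-stream scoring
            self.fuse = nn.Linear(2 * dim, dim)  # joint video-question context

        def attend(self, proj, stream, q_vec):
            # stream: (batch, n_segments, dim); q_vec: (batch, dim)
            scores = torch.bmm(proj(stream), q_vec.unsqueeze(2)).squeeze(2)
            attn = F.softmax(scores, dim=1)       # weights over video segments
            return torch.bmm(attn.unsqueeze(1), stream).squeeze(1)

        def forward(self, appearance, motion, q_vec):
            a = self.attend(self.app_proj, appearance, q_vec)
            m = self.attend(self.mot_proj, motion, q_vec)
            return torch.tanh(self.fuse(torch.cat([a, m], dim=1)))

The fused context vector would feed an answer classifier (for Action, Trans., and FrameQA) or a regressor (for Count); those heads are omitted here.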


2007 ◽  
Vol 13 (2) ◽  
pp. 185-189
Author(s):  
ROBERT DALE

“Powerset Hype to Boiling Point”, said a February headline on TechCrunch. In the last installment of this column, I asked whether 2007 would be the year of question-answering. My query was occasioned by a number of new attempts at natural language question-answering that were being promoted in the marketplace as the next advance upon search, and particularly by the buzz around the stealth-mode natural language search company Powerset. That buzz continued with a major news item in the first quarter of this year: in February, Xerox PARC and Powerset struck a much-anticipated deal whereby Powerset won exclusive rights to use PARC's natural language technology, as announced in a VentureBeat posting. Following the scoop, other news sources drew the battle lines with titles like “Can natural language search bring down Google?”, “Xerox vs. Google?”, and “Powerset and Xerox PARC team up to beat Google”. An April posting on Barron's Online noted that an analyst at Global Equities Research had cited Powerset in his downgrading of Google from Buy to Neutral. And all this on the basis of a product which, at the time of writing, very few people have actually seen. Indications are that the search engine is expected to go live by the end of the year, so we have a few more months to wait to see whether this really is a Google-killer. Meanwhile, another question remaining unanswered is what happened to the Powerset engineer who seemed less sure about the technology's capabilities: see the segment at the end of D7TV's PartyCrasher video from the Powerset launch party. For a more confident appraisal of natural language search, check out the podcast of Barney Pell, CEO of Powerset, giving a lecture at the University of California, Berkeley.

