Adaptable Conversational Machines

AI Magazine ◽  
2020 ◽  
Vol 41 (3) ◽  
pp. 28-44
Author(s):  
Nurul Lubis ◽  
Michael Heck ◽  
Carel Van Niekerk ◽  
Milica Gasic

In recent years we have witnessed a surge in machine learning methods that provide machines with conversational abilities. Most notably, neural-network-based systems have set the state of the art for difficult tasks such as speech recognition, semantic understanding, dialogue management, language generation, and speech synthesis. Still, unlike for the ancient game of Go, for instance, we are far from achieving human-level performance in dialogue. The reasons for this are numerous. One property of human–human dialogue that stands out is the infinite number of ways of expressing oneself during a conversation, even when its topic is restricted. The typical solution to this problem has been scaling up the data; the most prominent mantra in speech and language technology has been "there is no data like more data." However, researchers are now focusing on building smarter algorithms: algorithms that can learn efficiently from just a few examples. This is an intrinsic property of human behavior: an average human sees during their lifetime only a fraction of the data that we nowadays present to machines, and can even have an intuition about a solution before ever experiencing an example of one. The human-inspired ability to adapt may be one of the keys to pushing dialogue systems toward human performance. This article reviews advances in dialogue systems research with a focus on adaptation methods for dialogue modeling, and ventures a glance at the future of research on adaptable conversational machines.

2012 ◽  
Vol 28 (1) ◽  
pp. 59-73 ◽  
Author(s):  
Olivier Pietquin ◽  
Helen Hastie

User simulation is an important research area in the field of spoken dialogue systems (SDSs) because collecting and annotating real human–machine interactions is often expensive and time-consuming. However, such data are generally required for designing, training, and assessing dialogue systems. User simulations are especially needed when using machine learning methods for optimizing dialogue management strategies, such as reinforcement learning, where the amount of data necessary for training is larger than existing corpora. The quality of the user simulation is therefore of crucial importance because it dramatically influences the results in terms of SDS performance analysis and the learnt strategy. Assessment of the quality of simulated dialogues and user simulation methods is an open issue and, although assessment metrics are required, there is no commonly adopted metric. In this paper, we give a survey of user simulation metrics in the literature, propose some extensions, and discuss these metrics in terms of a list of desired features.


AI Magazine ◽  
2021 ◽  
Vol 42 (3) ◽  
pp. 31-42
Author(s):  
Joseph Konstan ◽  
Loren Terveen

From the earliest days of the field, recommender systems research and practice has struggled to balance and integrate approaches that focus on recommendation as a machine learning or missing-value problem with ones that focus on machine learning as a discovery tool and perhaps persuasion platform. In this article, we review 25 years of recommender systems research from a human-centered perspective, looking at the interface and algorithm studies that advanced our understanding of how system designs can be tailored to users' objectives and needs. At the same time, we show how external factors, including commercialization and technology developments, have shaped research on human-centered recommender systems. We show how several unifying frameworks have helped developers and researchers alike incorporate thinking about user experience and human decision-making into their designs. We then review the challenges, and the opportunities, in today's recommenders, looking at how deep learning and optimization techniques can integrate with both interface designs and human performance statistics to improve recommender effectiveness and usefulness.


2020 ◽  
Vol 8 ◽  
pp. 281-295
Author(s):  
Qi Zhu ◽  
Kaili Huang ◽  
Zheng Zhang ◽  
Xiaoyan Zhu ◽  
Minlie Huang

To advance multi-domain (cross-domain) dialogue modeling as well as alleviate the shortage of Chinese task-oriented datasets, we propose CrossWOZ, the first large-scale Chinese Cross-Domain Wizard-of-Oz task-oriented dataset. It contains 6K dialogue sessions and 102K utterances for 5 domains: hotel, restaurant, attraction, metro, and taxi. Moreover, the corpus contains rich annotation of dialogue states and dialogue acts on both the user and system sides. About 60% of the dialogues have cross-domain user goals that introduce inter-domain dependencies and encourage natural transitions across domains in conversation. We also provide a user simulator and several benchmark models for pipelined task-oriented dialogue systems, which will help researchers compare and evaluate their models on this corpus. The large size and rich annotation of CrossWOZ make it suitable for investigating a variety of tasks in cross-domain dialogue modeling, such as dialogue state tracking, policy learning, and user simulation.
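The notion of a cross-domain user goal can be made concrete with a small sketch. The field names below ("goal", "domain", "messages") are illustrative of how such a corpus might be organised, not the dataset's actual schema.

```python
# Two toy dialogues in a CrossWOZ-style layout: each dialogue carries a user
# goal made up of sub-goals, one per domain, plus the exchanged messages.
dialogues = {
    "d1": {"goal": [{"domain": "hotel"}, {"domain": "taxi"}],
           "messages": [{"role": "usr", "content": "..."},
                        {"role": "sys", "content": "..."}]},
    "d2": {"goal": [{"domain": "restaurant"}],
           "messages": [{"role": "usr", "content": "..."}]},
}

def is_cross_domain(dialogue):
    """A user goal is cross-domain if its sub-goals span more than one domain."""
    return len({g["domain"] for g in dialogue["goal"]}) > 1

cross = sum(is_cross_domain(d) for d in dialogues.values())
ratio = cross / len(dialogues)
```

In this toy corpus, one of the two dialogues is cross-domain; in CrossWOZ itself the reported proportion is about 60%.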


2020 ◽  
Vol 34 (04) ◽  
pp. 3970-3979
Author(s):  
Sahil Garg ◽  
Irina Rish ◽  
Guillermo Cecchi ◽  
Palash Goyal ◽  
Sarik Ghazarian ◽  
...  

We propose a novel dialogue modeling framework, the first nonparametric, kernel-function-based approach to dialogue modeling, which learns hashcodes as text representations; unlike traditional deep learning models, it handles relatively small datasets well while also scaling to large ones. We also derive a novel lower bound on mutual information, used as a model-selection criterion favoring representations with better alignment between the utterances of participants in a collaborative dialogue setting, as well as higher predictability of the generated responses. As demonstrated on three real-life datasets, most prominently psychotherapy sessions, the proposed approach significantly outperforms several state-of-the-art neural-network-based dialogue systems, both in computational efficiency, reducing training time from days or weeks to hours, and in response quality, achieving an order-of-magnitude improvement over competitors in the frequency of being chosen as the best model by human evaluators.


Author(s):  
Christopher W. Myers

An important goal of training systems research is the ability to train teams to criterion while simultaneously minimizing training resources. One promising approach is to develop synthetic agents that act as full-fledged members of a team. Five experts will highlight successes, failures, and continuing challenges associated with the development, validation, and deployment of synthetic agents as full-fledged teammates. The panel will provide an intimate look “under the hood” of synthetic agents, describe what each has found useful for developing a synthetic teammate that “plays well with others,” and discuss the key roadblocks that must be overcome for the further inclusion of synthetic teammates within human training systems. The lessons learned from these panelists will be of value to those interested in cognitive engineering and human performance modeling.


Author(s):  
Raivis Skadiņš ◽  
Mārcis Pinnis ◽  
Artūrs Vasiļevskis ◽  
Andrejs Vasiļjevs ◽  
Valters Šics ◽  
...  

The paper describes the Latvian e-government language technology platform HUGO.LV. It provides an instant translation of text snippets, formatting-rich documents and websites, an online computer-assisted translation tool with a built-in translation memory, a website translation widget, speech recognition and speech synthesis services, a terminology management and publishing portal, language data storage, analytics, and data sharing functionality. The paper describes the motivation for the creation of the platform, its main components, architecture, usage statistics, conclusions, and future developments. Evaluation results of language technology tools integrated in the platform are provided.


Author(s):  
Roger K. Moore

The past twenty-five years have witnessed a steady improvement in the capabilities of spoken language technology, first in the research laboratory and more recently in the commercial marketplace. Progress has reached a point where automatic speech recognition software for dictating documents onto a computer is available as an inexpensive consumer product in most computer stores, text-to-speech synthesis can be heard in public places giving automated voice announcements, and interactive voice response is becoming a familiar option for people paying bills or booking cinema tickets over the telephone. This article looks at the main computational approaches employed in contemporary spoken language processing. It discusses acoustic modelling, language modelling, pronunciation modelling, and noise modelling. The article also considers future prospects in the context of the obvious shortcomings of current technology, and briefly addresses the potential for achieving a unified approach to human and machine spoken language processing.


2018 ◽  
Vol 28 (3) ◽  
pp. 1133-1138
Author(s):  
Lindita Ademi ◽  
Valbon Ademi

Developing a TTS (text-to-speech) system is a very active field of research. As human–computer interfaces (HCI) come of age, the need for a more ergonomic and natural interface than the current one (keyboard, mouse, etc.) is constantly felt. Speaking of natural interfaces, what comes to mind is sound (speech) and sight (vision); these form the basis of many intelligent-systems research areas such as robotics. Moreover, speech can also serve as an excellent interface for visually impaired people or people with motor neuron disorders. In this paper we attempt to develop a TTS system for the Albanian language. Many commercial systems are available for foreign languages (mostly English), but there is yet to be a competitive system available for Albanian. Although building a very high-quality, unlimited-vocabulary TTS system is still a difficult task with many open research questions, we believe that building reasonable-quality voices for many tasks can serve our needs. Here we have worked with standard Albanian, the most commonly spoken variety. We hope to extend the system easily to other languages, since there are many underlying similarities between languages. Because Albanian is highly phonetic, its letter-to-sound rules are simple. We used standard concatenative synthesis. The main problem we faced was making the synthesized speech sound natural. We investigated the reasons for the mechanical-sounding speech and developed different synthesis models to overcome some of those problems. Moreover, we implemented some standard and also novel intonation and duration modification algorithms, which can be incorporated into the TTS at a later stage. Our main achievement was reasonably legible speech with an unlimited vocabulary. The paper presents a brief overview of the main text-to-speech synthesis problem and its subproblems, and the initial work done in building a TTS for Albanian.
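The simple letter-to-sound rules mentioned above can be sketched as a greedy longest-match segmentation of a word into phoneme-sized units. The digraph list below is a simplified subset of Albanian spelling, and the resulting units are ad hoc labels rather than a standard phone set.

```python
# Common Albanian digraphs, each of which spells a single sound.
DIGRAPHS = {"sh", "dh", "th", "xh", "zh", "gj", "nj", "ll", "rr"}

def letters_to_sounds(word):
    """Greedy longest-match segmentation into phoneme-sized units."""
    units, i = [], 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:   # digraph spells one sound
            units.append(pair)
            i += 2
        else:                  # single letter spells one sound
            units.append(word[i])
            i += 1
    return units

# "shqip" ("Albanian") segments as sh-q-i-p
```

In a concatenative synthesizer, each such unit would index a recorded acoustic segment (for example a diphone), and the segments are joined to produce the output waveform.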

