Alexa Prize Socialbot Grand Challenge 4 Proceedings


Dilek Hakkani-Tür, Senior Principal Scientist, Alexa AI

Amazon Alexa Prize

Further Advances in Open Domain Dialog Systems in the Fourth Alexa Prize Socialbot Grand Challenge


Building open domain conversational systems that allow users to have engaging conversations on topics of their choice is a challenging task. The Alexa Prize Socialbot Grand Challenge was launched in 2016 to tackle the problem of achieving natural, sustained, coherent and engaging open-domain dialogs. In the fourth iteration of the competition, university teams have incorporated semantic parsing, common sense reasoning, personalization, neural response generation, as well as novel response ranking models into the state of the art. The Fourth Socialbot Grand Challenge included an improved version of the CoBot (conversational bot) toolkit from the prior competition, along with upgraded topic and intent classifiers, a BERT-based named entity recognition model, a punctuation model that injects punctuation marks into the ASR output, and a new neural response generator trained on conversations with Alexa Let’s Chat. This paper outlines the advances developed by the university teams as well as the Alexa Prize team to move closer to the Grand Challenge objective, including open domain natural language understanding, commonsense reasoning, dialog management, neural response generation, and dialog evaluation. As of the end of the final feedback phase, the top 7-day average rating achieved by a socialbot was 3.56, with the top 90th percentile conversation duration of 12 minutes 7 seconds.


Shui Hu, Yang Liu, Anna Gottardi, Behnam Hedayatnia, Anju Khatri, Anjali Chadha, Qinlang Chen, Pankaj Rajan, Ali Binici, Varun Somani, Yao Lu, Prerna Dwivedi, Lucy Hu, Hangjie Shi, Sattvik Sahai, Mihail Eric, Karthik Gopalakrishnan, Seokhwan Kim, Spandana Gella, Alexandros Papangelis, Patrick Lange, Di Jin, Nicole Chartier, Mahdi Namazifar, Aishwarya Padmakumar, Sarik Ghazarian, Shereen Oraby, Anjali Narayan-Chen, Yuheng Du, Lauren Stubell, Savanna Stiff, Kate Bland, Arindam Mandal, Reza Ghanadan, Dilek Hakkani-Tür

Czech Technical University in Prague - Alquist

Alquist 4.0: Towards Social Intelligence Using Generative Models and Dialogue Personalization


The open-domain dialogue system Alquist aims to conduct coherent and engaging conversations, an ability that can be considered one of the benchmarks of social intelligence. The fourth version of the system, developed within the Alexa Prize Socialbot Grand Challenge 4, brings two main innovations. The first addresses coherence, and the second addresses the engagingness of the conversation.

For innovations regarding coherence, we propose a novel hybrid approach combining hand-designed responses and a generative model. The proposed approach utilizes hand-designed dialogues, out-of-domain detection, and a neural response generator. Hand-designed dialogues walk the user through high-quality conversational flows. The out-of-domain detection recognizes that the user diverges from the predefined flow and prevents the system from producing a scripted response that might not make sense for unexpected user input. Finally, the neural response generator generates a response based on the context of the dialogue that correctly reacts to the unexpected user input and returns the dialogue to the boundaries of hand-designed dialogues.
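The hybrid control flow described above can be illustrated with a minimal Python sketch. It is not the Alquist implementation: the flow table `FLOW`, the keyword-based `detect_intent`, and the canned `neural_response` are hypothetical stand-ins for the hand-designed dialogues, the out-of-domain detector, and the neural response generator. As a simplification, an out-of-domain turn leaves the scripted state unchanged so that the flow can resume on the next turn.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical hand-designed flow: state -> expected intent -> (response, next state).
FLOW: Dict[str, Dict[str, Tuple[str, str]]] = {
    "ask_pet": {
        "yes": ("Nice! What kind of pet do you have?", "ask_pet_kind"),
        "no": ("Would you like to have one someday?", "ask_pet_wish"),
    },
}

# Toy intent vocabulary; a real system would use a trained classifier.
INTENT_KEYWORDS: Dict[str, set] = {
    "yes": {"yes", "yeah", "sure"},
    "no": {"no", "nope"},
}


def detect_intent(utterance: str) -> Optional[str]:
    """Toy out-of-domain detector: returns None when no known intent matches."""
    tokens = set(utterance.lower().replace(",", " ").split())
    for intent, keywords in INTENT_KEYWORDS.items():
        if tokens & keywords:
            return intent
    return None


def neural_response(context: List[str]) -> str:
    """Placeholder for a neural response generator conditioned on the context."""
    return "That's interesting, tell me more."


def respond(state: str, utterance: str, context: List[str]) -> Tuple[str, str]:
    """Return (response, next_state) for one user turn."""
    intent = detect_intent(utterance)
    options = FLOW.get(state, {})
    if intent in options:
        # In-domain input: follow the scripted flow.
        return options[intent]
    # Out-of-domain input: use a generated reply and keep the scripted state.
    return neural_response(context + [utterance]), state
```

For example, `respond("ask_pet", "Yeah, I do", [])` follows the script, while `respond("ask_pet", "I think dolphins are smart", [])` falls back to the generator and stays in the `ask_pet` state.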

The innovations for engagement that we propose are mostly inspired by the famous exploration-exploitation dilemma. To conduct an engaging conversation with the dialogue partners, one has to learn their preferences and interests—exploration. Moreover, to engage the partner, we have to utilize the knowledge we have already learned—exploitation.

In this work, we present the principles and inner workings of individual components of the open-domain dialogue system Alquist developed within the Alexa Prize Socialbot Grand Challenge 4 and the experiments we have conducted to evaluate them.


Jakub Konrád, Jan Pichl, Petr Marek, Petr Lorenc, Van Duy Ta, Ondřej Kobza

Emory University - Emora

An Approach to Inference-Driven Dialogue Management within a Social Chatbot


We present a chatbot implementing a novel dialogue management approach based on logical inference. Instead of framing conversation as a sequence of response generation tasks, we model conversation as a collaborative inference process in which speakers share information to synthesize new knowledge in real time. Our chatbot pipeline accomplishes this modelling in three broad stages. The first stage translates user utterances into a symbolic predicate representation. The second stage then uses this structured representation in conjunction with a larger knowledge base to synthesize new predicates using efficient graph matching. In the third and final stage, our bot selects a small subset of predicates and translates them into an English response. This approach lends itself to understanding latent semantics of user inputs, flexible initiative taking, and responses that are novel and coherent with the dialogue context.


Sarah E. Finch, James D. Finch, Daniil Huryn, William (Mack) Hutsell, Xiaoyuan (Sophy) Huang, Han He, Jinho D. Choi

Moscow Institute of Physics and Technology - DREAM

DREAM Technical Report for the Alexa Prize 4


In this report, we present the DREAM 2 Socialbot design and share scientific and technology contributions made towards developing a fluent and meaningful socialbot for Alexa Prize 4. Building on last year’s solution, we added a wide range of script-driven skills created with the help of the novel Dialogue Flow Framework. To lay the foundation for discourse-driven dialogue strategy management, we introduced a tag-based Response Selector and a Speech Functions Classifier. We also began working on User and Bot Persona Knowledge Graphs and incorporated our work on the World Knowledge Graph along with Entity Linking. The final version of DREAM 2 Socialbot is still a hybrid system that combines rule-based, deep learning, and knowledge-based components, but it moves closer to a goal-aware system that can recognize users’ and its own goals and drive the dialogue strategically.


Dilyara Baymurzina, Denis Kuznetsov, Dmitry Evseev, Dmitry Karpov, Alsu Sagirova, Anton Peganov, Fedor Ignatov, Elena Ermakova, Daniil Cherniavskii, Sergey Kumeyko, Oleg Serikov, Yury Kuratov, Lidiya Ostyakova, Daniel Kornev, Mikhail Burtsev

Polytechnic University of Madrid - Genuine2

Genuine2: An open domain chatbot based on generative models


This paper describes the architecture, methodology and results of the Genuine2 chatbot for the Alexa Socialbot Grand Challenge 4. In contrast to previous years, our bot heavily relies on the usage of different types of generative models coordinated through a dialogue management policy that targets dialogue coherence and topic continuity. Different dialogue generators were incorporated to give variability to the conversations, including the dynamic incorporation of persona profiles. Given the characteristics and differences of the response generators, we developed mechanisms to control the quality of the responses (e.g., detecting toxicity and emotions, avoiding repetitions, increasing engagement, and avoiding misleading/erroneous responses). Besides, our system extends the capabilities of the Cobot architecture by incorporating modules to handle toxic users, question detection, up to 6 different types of emotions, new topics classification using zero-shot learning approaches, extended knowledge-grounded information, several strategies when using guided (predefined) prompts, and emotional voices. The paper finishes with an analysis of our results (including ratings, performance per topic, and generator), as well as the results of a reference-free metric that could complement the capabilities of the ranker to select better answers from the generators.


Mario Rodríguez-Cantelar, Diego de la Cal, Marcos Estecha, Alicia Grande Gutiérrez, Diego Martín, Natalia Rodríguez Nuñez Milara, Ramón Martínez Jiménez, Luis Fernando D’Haro

Stanford University - Chirpy Cardinal

Neural, Neural Everywhere: Controlled Generation Meets Scaffolded, Structured Dialogue


In this paper, we present the second iteration of Chirpy Cardinal, an open-domain dialogue agent developed for the Alexa Prize SGC4 competition. Building on the success of the SGC3 Chirpy, we focus on improving conversational flexibility, initiative, and coherence. We introduce a variety of methods for controllable neural generation, ranging from prefix-based neural decoding over a symbolic scaffolding, to pure neural modules, to a novel hybrid infilling-based method that combines the best of both worlds. Additionally, we enhance previous news, music and movies modules with new APIs, as well as make major improvements in entity linking, topical transitions, and latency. Finally, we expand the variety of responses via new modules that focus on personal issues, sports, food, and even extraterrestrial life! These components come together to create a refreshed Chirpy Cardinal that is able to initiate conversations filled with interesting facts, engaging topics, and heartfelt responses.


Ethan A. Chi, Caleb Chiam, Trenton Chang, Swee Kiat Lim, Chetanya Rastogi, Alexander Iyabor, Yutong He, Hari Sowrirajan, Avanika Narayan, Jillian Tang, Haojun Li, Ashwin Paranjape, Christopher D. Manning

SUNY Buffalo - PROTO

Proto: A Neural Cocktail for Generating Appealing Conversations


In this paper, we present our Alexa Prize Grand Challenge 4 socialbot: Proto. Leveraging diverse sources of world knowledge, and powered by a suite of neural and rule-based natural language understanding modules, state-of-the-art neural generators, novel state-based deterministic generators, an ensemble of neural re-rankers, a robust post-processing algorithm, and an efficient overall conversation strategy, Proto strives to be able to converse coherently about a diverse range of topics of interest to humans, and provide a memorable experience to the user. In this paper we dissect and analyze the different components and conversation strategies implemented by our socialbot, which enable us to generate colloquial, empathetic, engaging, self-rectifying, factually correct, and on-topic responses, and which have helped us achieve consistent scores throughout the competition.


Sougata Saha, Souvik Das, Elizabeth Soper, Erin Pacquetet, Rohini K. Srihari

University of California, Santa Cruz - Athena

Athena 2.0: Discourse and User Modeling in Open Domain Dialogue


Conversational agents are consistently growing in popularity and many people interact with them every day. While many conversational agents act as personal assistants, they can have many different goals. Some are task-oriented, such as providing customer support for a bank or making a reservation. Others are designed to be empathetic and to form emotional connections with the user. The Alexa Prize Challenge aims to create a socialbot that allows the user to engage in coherent conversations on a range of popular topics that will interest the user. Here we describe Athena 2.0, UCSC’s conversational agent for Amazon’s Socialbot Grand Challenge 4. Athena 2.0 utilizes a novel knowledge-grounded discourse model that tracks the entity links that Athena introduces into the dialogue, and uses them to constrain named-entity recognition and linking, and coreference resolution. Athena 2.0 also relies on a user model to personalize topic selection and other aspects of the conversation to individual users.


Omkar Patil, Lena Reed, Kevin K. Bowden, Juraj Juraska, Wen Cui, Vrindavan Harrison, Rishi Rajasekaran, Angela Ramirez, Cecilia Li, Eduardo Zamora, Phillip Lee, Jeshwanth Bheemanpally, Rohan Pandey, Adwait Ratnaparkhi, Marilyn Walker

University of Southern California - Viola

Viola: A Topic Agnostic Generate-and-Rank Dialogue System


We present Viola, an open-domain dialogue system for spoken conversation that uses a topic-agnostic dialogue manager based on a simple generate-and-rank approach. Leveraging recent advances of generative dialogue systems powered by large language models, Viola fetches a batch of response candidates from various neural dialogue models trained with different datasets and knowledge-grounding inputs. Additional responses originating from template-based generators are also considered, depending on the user’s input and detected entities. The hand-crafted generators build on a dynamic knowledge graph injected with rich content that is crawled from the web and automatically processed on a daily basis. Viola’s response ranker is a fine-tuned polyencoder that chooses the best response given the dialogue history. While dedicated annotations for the polyencoder alone can indirectly steer it away from choosing problematic responses, we add rule-based safety nets to detect neural degeneration and a dedicated classifier to filter out offensive content. We analyze conversations that Viola took part in for the Alexa Prize Socialbot Grand Challenge 4 and discuss the strengths and weaknesses of our approach. Lastly, we suggest future work with a focus on curating conversation data specifically for socialbots that will contribute towards a more robust data-driven socialbot.
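The generate-and-rank loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Viola's code: the two generator functions return canned candidates, `overlap_score` is a toy word-overlap scorer standing in for the fine-tuned polyencoder, and the blocklist and repetition rule are hypothetical stand-ins for the offensive-content classifier and the degeneration safety nets.

```python
from typing import Callable, List, Set

FALLBACK = "Sorry, could you say that again?"
BLOCKLIST: Set[str] = {"stupid"}  # toy stand-in for an offensive-content classifier


def neural_generator_a(history: List[str]) -> List[str]:
    """Hypothetical neural generator returning canned candidates."""
    return ["I love hiking too, especially in the fall.", "I I I I I I"]


def template_generator(history: List[str]) -> List[str]:
    """Hypothetical template-based generator."""
    return ["Did you know that hiking burns a lot of calories?"]


def words(text: str) -> Set[str]:
    """Lowercased word set with surrounding punctuation stripped."""
    return {token.strip(".,!?") for token in text.lower().split()}


def is_degenerate(response: str) -> bool:
    """Rule-based safety net: flag responses made mostly of repeated tokens."""
    tokens = response.lower().split()
    return len(tokens) >= 4 and len(set(tokens)) <= len(tokens) // 2


def is_offensive(response: str) -> bool:
    return bool(words(response) & BLOCKLIST)


def overlap_score(history: List[str], response: str) -> float:
    """Toy ranker: share of response words that also occur in the last turn."""
    response_words = words(response)
    if not history or not response_words:
        return 0.0
    return len(words(history[-1]) & response_words) / len(response_words)


def generate_and_rank(
    history: List[str],
    generators: List[Callable[[List[str]], List[str]]],
    scorer: Callable[[List[str], str], float],
) -> str:
    """Collect candidates from all generators, filter them, return the best."""
    candidates = [resp for gen in generators for resp in gen(history)]
    safe = [r for r in candidates if not is_degenerate(r) and not is_offensive(r)]
    if not safe:
        return FALLBACK
    return max(safe, key=lambda r: scorer(history, r))
```

Because the dialogue manager only sees a list of candidates and a scoring function, generators can be added or removed without changing the selection logic, which is the sense in which the design is topic-agnostic.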


Hyundong Cho, Basel Shbita, Kartik Shenoy, Shuai Liu, Nikhil Patel, Hitesh Pindikanti, Jennifer Lee, Jonathan May

University of Texas at Dallas - CASPR

CASPR: A Commonsense Reasoning-based Conversational Socialbot


We report on the design and development of the CASPR system, a socialbot designed to compete in the Amazon Alexa Socialbot Challenge 4. CASPR’s distinguishing characteristic is that it will use automated commonsense reasoning to truly “understand” dialogs, allowing it to converse like a human. Three main requirements of a socialbot are that it should be able to “understand” users’ utterances, possess a strategy for holding a conversation, and be able to learn new knowledge. We developed techniques such as the conversational knowledge template (CKT) to approximate commonsense reasoning needed to hold a conversation on specific topics. We present the philosophy behind CASPR’s design as well as details of its implementation. We also report on CASPR’s performance as well as discuss lessons learned.


Kinjal Basu, Huaduo Wang, Nancy Dominguez, Xiangci Li, Fang Li, Sarat Chandra Varanasi, Gopal Gupta
