Alexa Conversations helps you create more natural-feeling Alexa skills with fewer lines of code. The service is an AI-driven approach to dialog management that lets you create skills customers can interact with naturally, using the phrases they prefer in the order they prefer, while freeing you to focus on the highest-value parts of your experience.
We launched Alexa Conversations in the en_US locale in 2020 and are now extending availability to 8 new locales (en_AU, en_CA, en_IN, en_GB, de_DE, ja_JP, es_ES, and es_US), including 3 new languages (Japanese, German, and Spanish), to help you build engaging Alexa experiences for your customers.
Conversations are more nuanced than simply understanding words and sentences. As a result, you need to accommodate a wide range of phrases and unexpected dialog turns and have the memory to sustain long back-and-forth sessions. Your skills need to flex while gathering inputs and accept too much, too little, or out-of-sequence information. Skills need to automatically track across topical shifts, reference context, and adjust to corrections. For example, a pizza order is a conversation when customers say “a medium two-topping” (answering more than one question at once), “how many people does that feed” (context carryover, answering with a question), “pepperoni and onion” (list values), “make that a large” (correction), or “apply the third coupon” (using voice to select from a visual or vocalized list), or when they tap choices on a screen. Supporting these conversational patterns makes Alexa skills feel more natural and delights customers in new ways.
Building skills that feel natural with today’s techniques can be cumbersome and error-prone, and developers have told us they are apt to abandon or compromise their efforts. The number of ways customers can engage and the variety of dialog paths they can take often result in an explosion of states and code. For example, a pizza ordering skill with seven topping combinations might require more than 5,000 dialog paths. Using slots and intents means hard-coding every element, including carrying context through every turn, building a state machine to manage the variables, and accounting for every possible phrase a customer might use. A skill that feels natural can become prohibitively complex to build and maintain, but limiting the skill produces an unnatural interaction that doesn’t fully satisfy customers. Alexa Conversations addresses this dilemma by handling the complex dialog management for you, so that skill builders can focus on delighting their customers.
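To get a feel for how quickly the paths multiply, here is a rough back-of-the-envelope sketch in Python. It assumes (our assumption for illustration, not a derivation given above) that a customer can supply seven pieces of information in any order and that each ordering counts as its own dialog path. The item names are placeholders.

```python
import math
from itertools import permutations

# Hypothetical: seven pieces of information a customer could supply in any order.
items = ["item_%d" % i for i in range(1, 8)]

# Each distinct ordering is one dialog path a hand-built state machine would have to cover.
path_count = sum(1 for _ in permutations(items))

# The same number, computed directly as 7 factorial.
assert path_count == math.factorial(len(items))
print(path_count)  # 5040
```

Under that assumption, seven items already produce 5,040 orderings before any phrasing variations, corrections, or follow-up questions are counted.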
Alexa Conversations uses AI to bridge the gap between what you can build manually and the vast range of possible conversations. You provide a few sample dialogs showing your ideal dialog paths, along with templates for the APIs you want called, and AI extrapolates the spectrum of phrasing variations and dialog paths for you. The AI also takes on dialog state and context management, including carrying context across turns, managing lists, and supporting corrections. Alexa Conversations helps customers experience natural conversations with less development effort, freeing you to focus on creating a quality experience instead of on flowcharts.
Alexa Conversations puts numerous AI innovations to work for you, including a unique method for generating training data and an end-to-end conversational management architecture. Starting with a few sample dialogs, a proprietary simulation engine applies advanced algorithms to generate tens of thousands of dialogs between agents representing Alexa and the customer. This large and variable dataset includes happy paths, phrasing variations, and uncommon alternatives to create a wider range of possible dialog paths. A novel model architecture combines this synthetic data with pre-trained components to automatically train deep-learning neural networks, including deep transformer-based encoders, recurrent neural networks, and attention-based pointer networks. The trained model can predict the next steps in the dialog based on the entire conversation’s history, the current state, and the capabilities of the developer’s APIs. It can take action to drive the conversation forward, such as confirming inputs, eliciting missing information, retrieving information by calling an API through your skill, or gracefully following the customer’s direction. Alexa Conversations uses AI to do the heavy lifting of creating language and dialog path permutations, then manages the conversational elements for you.
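The simulation engine itself is proprietary, so the following is only a toy sketch of the general idea of expanding one sample dialog into many variants. It assumes a deliberately simple scheme: shuffle the order in which the customer provides each value and pick a random phrasing for each one. `PHRASINGS`, `SAMPLE_VALUES`, and `generate_dialogs` are hypothetical names invented for this example, not part of Alexa Conversations.

```python
import random
from itertools import permutations

# Hypothetical phrasing variants for each piece of information in one sample dialog.
PHRASINGS = {
    "size": ["a {v}", "make it {v}", "{v} please"],
    "toppings": ["with {v}", "{v} on top", "I'd like {v}"],
    "crust": ["{v} crust", "on {v} crust"],
}
SAMPLE_VALUES = {"size": "medium", "toppings": "pepperoni and onion", "crust": "thin"}


def generate_dialogs(values, phrasings, limit, seed=0):
    """Return up to `limit` distinct customer-turn sequences, varying slot order and wording."""
    rng = random.Random(seed)
    orders = list(permutations(values))
    dialogs = set()
    # Cap the attempts so the loop ends even if `limit` exceeds the number of possible variants.
    for _ in range(limit * 20):
        if len(dialogs) >= limit:
            break
        order = rng.choice(orders)
        turns = tuple(rng.choice(phrasings[slot]).format(v=values[slot]) for slot in order)
        dialogs.add(turns)
    return sorted(dialogs)


dialogs = generate_dialogs(SAMPLE_VALUES, PHRASINGS, limit=10)
for turns in dialogs[:3]:
    print(" / ".join(turns))
```

Even this small setup has 6 orderings times 18 phrasing combinations, or 108 possible customer-side dialogs, which hints at why generated data can cover far more paths than hand-written samples.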
When building skills with Alexa Conversations, you provide sample dialogs in the new dialog editor, and then annotate them with Dialog Acts, Utterance Sets, and Responses that include audio elements (Alexa Presentation Language for Audio) and visual elements (Alexa Presentation Language). You can also specify when to invoke APIs along with their required arguments, so the dialog manager can gather the information to trigger your skill code. Over the course of the conversation, your skill responds to fulfill the customer's request. You can continuously improve your experience by updating the sample dialogs and debugging with the updated testing tools, all without refactoring your logic. Skills built using Alexa Conversations can be published using the existing certification and publishing tools and processes, and Alexa Conversations can be added to existing custom skills. We provide fully functional templates along with tutorials and a guided tour of the dialog annotation process to get you started quickly.
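As a rough mental model of "gather the required arguments, then trigger your skill code", here is a minimal rule-based sketch in Python. It is not the Alexa Skills Kit API, and Alexa Conversations uses a trained model instead of hand-written rules like these. `ORDER_PIZZA_API`, `update_state`, and `next_action` are hypothetical names used only for illustration.

```python
# Hypothetical API definition: a name plus the arguments that must be collected first.
ORDER_PIZZA_API = {"name": "OrderPizza", "required": ["size", "toppings"]}


def update_state(state, turn_values):
    """Merge values heard in one turn; a repeated argument overwrites the old value (a correction)."""
    new_state = dict(state)
    new_state.update(turn_values)
    return new_state


def next_action(api, state):
    """Elicit the first missing required argument, or invoke the API once all are present."""
    for arg in api["required"]:
        if arg not in state:
            return ("elicit", arg)
    return ("invoke", api["name"], {arg: state[arg] for arg in api["required"]})


state = {}
# "a medium two-topping": one turn answers more than one question.
state = update_state(state, {"size": "medium", "topping_count": 2})
print(next_action(ORDER_PIZZA_API, state))  # ('elicit', 'toppings')

# "pepperoni and onion": a list value.
state = update_state(state, {"toppings": ["pepperoni", "onion"]})
# "make that a large": a correction overwrites the earlier size.
state = update_state(state, {"size": "large"})
print(next_action(ORDER_PIZZA_API, state))
# ('invoke', 'OrderPizza', {'size': 'large', 'toppings': ['pepperoni', 'onion']})
```

The point of the sketch is the division of labor: the dialog manager tracks state and decides what to ask next, and your skill code only runs once the required arguments are available.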
We worked with multiple partners to obtain feedback on the functioning of the product. They have shared anecdotes from their hands-on experience.
“Troublesome dialog management was not needed, that was fantastic. I want to make better skills by using Alexa Conversations, and I hope more developers in Japan will be able to experience this.” - Takara, Developer based in Japan, Skill: たまごの時間 (Egg Time)
“CogniVocal leveraged the power and convenience of Alexa Conversations without the high-level development setup required for an Alexa Skills Kit deployment. The interface abstracts away a lot of code setup and management and let us concentrate on the experience for our users on our Wine Match skill.” - Dyung Ngo, CTO, CogniVocal, Skill: Wine Match
“Alexa Conversations lets customers tell us what they want, when they want it, in plain English. Now they can order instantaneously, without interrupting work, rest, or game time. Alexa Conversations frees customers to order online instantaneously in a way that is touch-free, screen-free, and hassle-free.” - Papa John’s, Skill: Papa John’s
“The graphical interface is intuitive and we did not require a specific background in computer science or NLP. At the same time, several advanced features enabled us to build more flexible dialog paths and natural conversations with our customers. The technical implementation into our existing skill was straightforward and delegations between Alexa Conversations and the interaction model proved to be very reliable. We are very happy about the outcome and excited to implement Alexa Conversations in future use cases.” - Deutsche Bahn, Skill: Deutsche Bahn
“Alexa Conversations allows Modal to deliver customer experiences that are far more natural, and when you compare it to the traditional hierarchical approach, Alexa Conversations is clearly the future for voice user interface design and development.” - Vincent Slevin, Founder and CEO, Modal, Skill: Mamasitas