Introducing Alexa Conversations (beta), a New AI-Driven Approach to Providing Conversational Experiences That Feel More Natural

Drew Meyer Jul 22, 2020
News Voice User Interface

Alexa Conversations (beta) is now available to help you create more natural-feeling Alexa skills with fewer lines of code. Alexa Conversations is a new AI-driven approach to dialog management that enables you to create skills that customers can interact with in a natural, less constrained way, using the phrases they prefer in the order they prefer, while freeing you to focus on the highest-value parts of your experience. The Alexa Conversations feature is available in the en_US locale. Visit the Alexa Developer Console to start building with Alexa Conversations. Then enter the Alexa Skills Challenge: Alexa Conversations contest to compete for $100,000 in cash prizes.

Making Conversations Feel Natural

Conversations are more nuanced than simply understanding words and sentences. As a result, you need to accommodate a wide range of phrases and unexpected dialog turns and have the memory to sustain long back-and-forth sessions. Your skills need to flex while gathering inputs and accept too much information, too little information, or information out of sequence. Skills need to automatically track across topical shifts, reference context, and adjust to corrections. For example, a pizza order is a conversation when customers say: “a medium two-topping” (answering more than one question at once), “how many people does that feed” (context carryover, anaphora, answering with a question), “pepperoni and onion” (list values), “make that a large” (correction), “apply the third coupon” (using voice to select from a visual or vocalized list), or tap choices on a screen. Supporting these conversational patterns makes Alexa skills feel more natural and delights customers in new ways.

The Skill-Builder’s Dilemma

Building skills that feel natural with today’s techniques can be cumbersome and error-prone, and developers have told us they are apt to abandon or compromise their efforts. The number of ways customers can engage and the variety of dialog paths they can take often result in a combinatorial explosion of states and code. For example, a pizza ordering skill with seven topping combinations might require more than 5,000 dialog paths. Using slots and intents means hard-coding every element, including carrying context through every turn, building a state machine to manage the variables, and accounting for every possible phrase a customer might use. A skill that feels natural can become prohibitively complex to build and maintain, but limiting the skill produces an unnatural interaction that doesn’t fully satisfy customers.
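
To get a feel for how quickly dialog paths multiply, the sketch below counts the orders in which a customer could supply a set of details. It is only an illustration: this post does not say how the 5,000 figure was calculated, and the seven detail names are placeholders assumed for the example. Counting the orderings of seven items happens to give 5,040, which is the same order of magnitude.

```python
from itertools import permutations


def count_orderings(details):
    """Count the distinct orders in which every detail could be supplied once."""
    return sum(1 for _ in permutations(details))


# Placeholder detail names, assumed for illustration only.
details = ["size", "crust", "sauce", "cheese", "topping_1", "topping_2", "coupon"]

print(count_orderings(details))  # 7! = 5040
```

Each of those orderings would need its own path in a hand-built state machine, before any phrasing variations or corrections are taken into account.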

Fill in the Blanks with Artificial Intelligence

Alexa Conversations uses AI to bridge the gap between what you can build manually and the vast range of possible conversations. You provide a few sample dialogs showing your ideal dialog paths, along with templates for the APIs that need to be called, and AI extrapolates the spectrum of phrasing variations and dialog paths for you. The AI also takes on dialog state and context management, including carrying context across turns, managing lists, and supporting corrections. Alexa Conversations helps customers experience natural conversations with less development effort, freeing you to focus on creating a quality experience instead of on flowcharts.
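
As a rough picture of the state and context bookkeeping described here, the following sketch tracks a pizza order across turns. It is a hand-written toy, not how Alexa Conversations is implemented: the `DialogContext` class, its fields, and the slot names are all hypothetical, and the values passed to each turn are assumed to have been extracted already.

```python
from dataclasses import dataclass, field


@dataclass
class DialogContext:
    """Hypothetical container for what has been said and understood so far."""
    slots: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def apply_turn(self, utterance, values):
        """Record the turn and merge its values into the running state.

        A value for a slot that is already filled overwrites the earlier
        one, which is how a correction is modeled in this toy.
        """
        self.history.append(utterance)
        self.slots.update(values)
        return self


order = DialogContext()
# One utterance answers two questions at once.
order.apply_turn("a medium two-topping", {"size": "medium", "topping_count": 2})
# A list value.
order.apply_turn("pepperoni and onion", {"toppings": ["pepperoni", "onion"]})
# A correction to an earlier answer.
order.apply_turn("make that a large", {"size": "large"})

print(order.slots)
```

With slots and intents alone, this kind of bookkeeping has to be written and maintained in skill code for every path; the point of the feature is that the dialog manager carries it instead.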

The AI Behind the Scenes: How Alexa Conversations Works

Alexa Conversations puts numerous AI innovations to work for you, including a unique method for generating training data and an end-to-end conversational management architecture. Starting with a few sample dialogs, a proprietary simulation engine applies advanced algorithms to generate tens of thousands of dialogs between agents representing Alexa and the customer. This large and varied dataset includes happy paths, phrasing variations, and uncommon alternatives to create a wider range of possible dialog paths. A novel model architecture combines this synthetic data with pre-trained components to automatically train deep-learning neural networks, including deep transformer-based encoders, recurrent neural networks, and attention-based pointer networks. The trained model can predict the next steps in the dialog based on the entire conversation’s history, the current state, and the capabilities of the developer’s APIs. It can take action to drive the conversation forward, such as confirming inputs, eliciting missing information, retrieving information through an API call to your skill, or gracefully following the customer’s direction. Alexa Conversations uses AI to do the heavy lifting to create language and dialog path permutations, then manages the conversational elements for you.
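
The simulation engine is proprietary and its algorithms are not described in this post, so the sketch below is only a stand-in for the general idea of growing a few samples into many dialogs. It uses plain enumeration, which is an assumption of this example and not the engine's method: every ordering of the requested details is combined with every phrasing of each detail. The function name and the sample phrasings are hypothetical.

```python
from itertools import permutations, product


def expand_dialog(slot_phrasings):
    """Enumerate customer-side dialogs that vary both order and phrasing.

    slot_phrasings maps a slot name to the alternative ways a customer
    might express it. Each result is a tuple of utterances, one per slot.
    """
    variants = []
    for order in permutations(slot_phrasings):
        for choice in product(*(slot_phrasings[slot] for slot in order)):
            variants.append(choice)
    return variants


# Hypothetical phrasings for three details of a pizza order.
samples = {
    "size": ["a medium", "medium size please"],
    "toppings": ["pepperoni and onion", "with onion and pepperoni"],
    "coupon": ["apply the third coupon", "use coupon number three"],
}

dialogs = expand_dialog(samples)
print(len(dialogs))  # 3! orderings * 2**3 phrasing choices = 48
```

Even this naive expansion turns three details with two phrasings each into 48 dialogs, which hints at why generated data can cover far more paths than hand-written samples.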

Using Alexa Conversations

When building skills with Alexa Conversations, you can provide sample dialogs in the new dialog editor, and then annotate them with Dialog Acts, Utterance Sets, and Responses with audio (Alexa Presentation Language for audio) and visual elements (Alexa Presentation Language). You can also specify when to invoke APIs along with their required arguments, so the dialog manager can gather the information to trigger your skill code. During the course of the conversation, your skill will respond to fulfill the user request. You can continuously improve your experience by updating the sample dialogs and debugging with the updated testing tools, all without refactoring your logic. Skills built using Alexa Conversations can be published using the existing certification and publishing tools and processes, and Alexa Conversations can be added to existing custom skills. We provide fully functional templates along with tutorials and a guided tour of the dialog annotation process to get you started quickly.
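
The relationship between an API's required arguments and the dialog manager's behavior can be pictured with a small rule-based sketch. This is not the Alexa Conversations API or its trained model: `ApiDefinition`, `next_action`, and the `PlaceOrder` example are hypothetical names, and a fixed rule stands in for what the post describes as a learned prediction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiDefinition:
    """Hypothetical description of a skill API and the arguments it requires."""
    name: str
    required_args: tuple


def next_action(api, collected):
    """Decide whether to ask for a missing argument or invoke the API.

    Returns ("elicit", arg_name) for the first required argument that has
    not been collected yet, or ("invoke", api_name) once all are present.
    """
    for arg in api.required_args:
        if arg not in collected:
            return ("elicit", arg)
    return ("invoke", api.name)


place_order = ApiDefinition(name="PlaceOrder", required_args=("size", "toppings"))

print(next_action(place_order, {"size": "large"}))
print(next_action(place_order, {"size": "large", "toppings": ["pepperoni", "onion"]}))
```

In this picture, the skill code only has to implement the API itself; deciding what to ask next is left to the dialog manager.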

Skill Builder Feedback: “A Breakthrough for Developers”

Last year we worked with OpenTable, Uber, and Atom Tickets to get feedback on an early product design and work with us on a concept skill. As part of the Alexa Live virtual developer event, Alexa Conversations preview participants (iRobot, Philosophical Creations, and Arrive) shared anecdotes from their hands-on experience.

Today, the iRobot Home skill allows customers to schedule cleaning with their Roomba robot vacuum or Braava jet robot mop, but the rigid dialog requirements offer a limited experience. Managing this open-ended task with Alexa Conversations enhances the customer experience, allowing customers to follow any number of dialog paths, make changes without starting over, and speak more naturally.

“We’re always looking to make our robots as easy to use as possible, and voice is an important part of that journey,” says Chris Jones, CTO at iRobot. “Our upcoming Alexa Conversations-powered skill will enable our users to even more naturally schedule their robot to clean, like simply asking to clean the kitchen and dining room every weekday at 8 a.m. We know our customers will love new voice experiences like this.”

Philosophical Creations founder Steven Arkonovich saw a chance to improve the interactions for his Big Sky skill, giving customers more freedom in how they ask for hyperlocal weather information.

“Alexa Conversations promises to be a breakthrough for developers writing Alexa skills. And, more importantly, it will create great new experiences for customers,” says Arkonovich. “I can provide this experience by supplying dialog and without writing lots of code. Alexa’s AI generates sample utterances and keeps track of the context, all with very little input from my skill code. Even better, I can use Alexa Conversations to extend my current skill without rewriting the entire codebase. Users can speak more naturally, or can change their minds mid-conversation, and Alexa will just keep up.”

Arrive offers parking automation solutions with Alexa to help their customers find, book, pay for and navigate to thousands of parking spaces. Alexa Conversations helps make Arrive’s in-car experience more functional and satisfying without changing the existing code.

“We’re very excited for the potential of Alexa Conversations to improve our skill experience by training dialog models with real user interactions,” says Jeff Judge, CTO at Arrive. “We envision a future where skill developers can focus on delivering the most meaningful content within their skill, leaving the heavy lifting of input processing to the Alexa Conversations engine. That’s a huge step forward.”

Register for the Alexa Skills Challenge: Alexa Conversations

Today we are also announcing a new Alexa Skills Challenge: Alexa Conversations, an opportunity for advanced voice developers to compete for more than $100,000 in cash prizes. Participants can enter to compete for the $20,000 Grand Prize, one of ten $5,000 finalist prizes, and one of eight $2,000 bonus category prizes. Register for the challenge today and submit your entries before September 14, 2020. Don’t wait! The top 300 published submissions will receive a $50 Amazon gift card.

Get Started with Alexa Conversations (beta)

Visit the Alexa Developer Console to enable the Alexa Conversations beta and publish skills in the en_US locale.
