Dialog Design for Alexa Conversations

Think back to conversations you’ve had that felt natural. Those conversations are often cooperative and constantly moving forward without friction. You don’t think about what must happen next; you simply talk. Replicating this naturalness in conversations between your users and Alexa is a hard problem to solve. Conversations between people are nuanced: they're unscripted and can span many topics. Alexa Conversations handles much of this nuance for you.

The following sections explain how Alexa Conversations works, its awareness of user behaviors and expectations, and some best practices. Use this content as a guide to help you design natural, seamless sample dialogs.

How Alexa Conversations works

Alexa Conversations is an artificial intelligence (AI) driven engine that you train with sample dialogs to create simulated dialogs. The simulated dialogs represent a wide range of user responses in a turn-based format between the user and Alexa. The AI engine outputs a variety of utterances, understands pronouns, carries over context, handles over- and under-answering, and adjusts for user corrections.

About user behavior

Alexa Conversations accounts for the common user behaviors described in this section. If your sample dialogs include a few examples of these behaviors, Alexa Conversations helps your skill handle a wide variety of possible user inputs.

Users say a variety of phrases to express the same idea

There’s what a person says and then there’s the intent behind what they say. A user’s background, environment, and state of mind are just some of the factors that shape how they phrase a request. For example, both of the following utterances express the same intent:

User: I'd like to order a cake.

User: Get me a cake, please.

Users employ pronouns and carry-over context

When people speak, they often jump ahead in a conversation or refer back to a previous discussion point. When they refer back, they tend to use pronouns such as "it" or "that" to avoid repetition. Alexa Conversations handles the inferred context so that the dialog can continue in a natural flow.

User: I'd like a chocolate half sheet cake.

Alexa: Got it. What type of icing would you like?

User: How many people can that feed?

Alexa: A half sheet cake feeds about 20 people.

User: Make it a full sheet cake.

Alexa: Chocolate full sheet cake, got it. What type of icing would you like?
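The idea of carrying context across turns can be illustrated with a deliberately small Python sketch. This is not how Alexa Conversations resolves references internally (the trained model does that for you); the names PRONOUNS and resolve_reference are hypothetical and exist only for this illustration.

```python
# Toy illustration only: the real engine infers context from trained models.
PRONOUNS = {"it", "that", "this"}  # hypothetical, minimal pronoun list


def resolve_reference(mention: str, last_mentioned: str) -> str:
    """Return the most recently mentioned item when `mention` is a pronoun.

    Any other mention is treated as an explicit value and returned unchanged.
    """
    if mention.strip().lower() in PRONOUNS:
        return last_mentioned
    return mention


# Remembered from "I'd like a chocolate half sheet cake."
last_mentioned = "half sheet cake"

# "How many people can that feed?" -> "that" points at the remembered cake.
subject = resolve_reference("that", last_mentioned)
```

Because the remembered item is kept between turns, the side question about servings does not discard the order in progress, and Alexa can return to the icing prompt afterward.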

Users over-answer a question

Many users encounter voice experiences that are inflexible and limited to one topic or question per turn. Users can find this experience time-consuming and frustrating because they want to give extra information or they anticipate further related questions.

The following example shows how Alexa Conversations handles over-answering. In this case, the skill asks for the size but the user gives more information (size, flavor, and icing). With Alexa Conversations, the skill accepts all of the user’s input, and then moves on to the next step in the conversation.

Alexa: What size cake do you want?

User: I want a chocolate half sheet cake with buttercream icing.

Alexa: Chocolate half sheet cake with buttercream, got it. Would you like anything else?

Users under-answer a question

Occasionally, users might misunderstand or only hear part of a question. To help move the conversation along, Alexa takes the information given and prompts the user for any missing pieces.

The following example shows how Alexa Conversations handles under-answering. In this case, the skill asks for a size and flavor but the user only responds with a flavor. Alexa then asks for the missing size to fill in the gap.

Alexa: Let me know the size and flavor of your first cake.

User: Chocolate.

Alexa: And what size?

User: Half sheet.
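Over-answering and under-answering can both be pictured as slot filling: accept whatever the user supplied in a turn, then ask only for what is still missing. The following Python sketch illustrates that idea under stated assumptions. It is not the Alexa Conversations API; REQUIRED_SLOTS, PROMPTS, and next_prompt are hypothetical names, and the flavor prompt wording is invented for the example.

```python
from typing import Dict, Optional

# Hypothetical slot order and prompts for the cake example.
REQUIRED_SLOTS = ("size", "flavor", "icing")
PROMPTS = {
    "size": "And what size?",
    "flavor": "What flavor would you like?",  # invented wording
    "icing": "What type of icing would you like?",
}


def next_prompt(filled: Dict[str, str], heard: Dict[str, str]) -> Optional[str]:
    """Merge the slots heard this turn, then return the prompt for the
    first slot that is still missing, or None when the order is complete."""
    filled.update(heard)
    for slot in REQUIRED_SLOTS:
        if slot not in filled:
            return PROMPTS[slot]
    return None


# Under-answering: the user gives only a flavor, so the size is requested.
order: Dict[str, str] = {}
under = next_prompt(order, {"flavor": "chocolate"})

# Over-answering: one utterance fills every slot, so nothing is left to ask.
over = next_prompt(
    {}, {"size": "half sheet", "flavor": "chocolate", "icing": "buttercream"}
)
```

The same function covers both behaviors because it never assumes one slot per turn; it only checks what remains unfilled.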

Users correct themselves

Conversations aren’t linear. People might answer and then change their mind. In these circumstances, a user doesn’t have to leave the skill and come back. Instead, they can simply course correct in one step. The following example shows a user changing their mind about the size of the cake.

Alexa: I have a chocolate half sheet cake with buttercream icing. Would you like anything else?

User: Actually, make that a full size.

Alexa: Ok. I have a chocolate full sheet cake with buttercream icing. Anything else?
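A correction changes one detail and leaves the rest of the order intact. A minimal Python sketch of that behavior, assuming the order is held as a simple dictionary, might look like the following. The function names apply_correction and summarize are hypothetical and are not part of Alexa Conversations.

```python
from typing import Dict


def apply_correction(order: Dict[str, str], slot: str, new_value: str) -> Dict[str, str]:
    """Return a copy of the order with one slot replaced; other slots are kept."""
    updated = dict(order)
    updated[slot] = new_value
    return updated


def summarize(order: Dict[str, str]) -> str:
    """Read the order back in the same shape as the dialog above."""
    return (
        f"I have a {order['flavor']} {order['size']} cake "
        f"with {order['icing']} icing."
    )


order = {"flavor": "chocolate", "size": "half sheet", "icing": "buttercream"}

# "Actually, make that a full size." -> only the size changes.
corrected = apply_correction(order, "size", "full sheet")
summary = summarize(corrected)
```

Returning a copy rather than editing the order in place keeps the earlier state available, which can be useful if the user changes their mind again.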

User behavior quick reference

Use the following quick reference table to guide you in the type of user-behavior dialogs you can design when you train an Alexa Conversations model.

Users...	Example dialog
Say a variety of phrases for the same idea	User: "I'd like to order a cake", "Get me a cake, please."
Employ pronouns and carry over context	User: I'd like a chocolate half sheet cake. Alexa: Got it. What type of icing would you like? User: How many people can that feed? Alexa: A half sheet cake feeds about 20 people.
Over-answer a question	Alexa: What size cake do you want? User: I want a chocolate half sheet cake with buttercream icing. Alexa: Chocolate half sheet cake with buttercream, got it. Would you like anything else?
Under-answer a question	Alexa: Let me know the size and flavor of your first cake. User: Chocolate. Alexa: And what size?
Correct themselves	Alexa: I have a chocolate half sheet cake with buttercream icing. Would you like anything else? User: Actually, make that a full size. Alexa: Ok. I have a chocolate full sheet cake with buttercream icing. Anything else?

About user expectations

Alexa Conversations also accounts for the following user expectations. Knowing these expectations helps you understand the types of responses you can design when you train a model.

Users expect Alexa to vary its responses

Providing variety in Alexa’s responses keeps the conversation delightful and avoids sounding robotic and boring. For example, here are some variations on giving the daily specials:

Alexa: We have all types of great cakes for you today.

Alexa: There are so many great specials for you!
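One simple way to picture response variety is to choose among several phrasings of the same message and avoid repeating the previous one. The following Python sketch shows that idea only; it is not how Alexa Conversations selects responses, and pick_response is a hypothetical helper.

```python
import random

# The two variations from the example above.
SPECIALS_RESPONSES = [
    "We have all types of great cakes for you today.",
    "There are so many great specials for you!",
]


def pick_response(variants, last=None, rng=random):
    """Pick a variant at random, avoiding an immediate repeat when possible."""
    candidates = [v for v in variants if v != last] or list(variants)
    return rng.choice(candidates)


first = pick_response(SPECIALS_RESPONSES)
second = pick_response(SPECIALS_RESPONSES, last=first)  # differs from `first`
```

With only two variants, excluding the last response means the two phrasings alternate; a larger list gives more natural variation.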

Users expect Alexa to confirm that it heard the right utterance

Certain high-risk scenarios, such as purchasing an item, require Alexa to explicitly confirm the user’s choice. Alexa can confirm that it has heard the right thing and, if not, re-prompt the user to answer the question.

User: I’d like to purchase the cheesecake special.

Alexa: Okay, that's one order of the cheesecake special, right?

User: Yes.
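The confirmation pattern can be sketched as a check that runs only for high-risk actions. The following Python example is an illustration under assumptions, not the Alexa Conversations confirmation mechanism: HIGH_RISK_ACTIONS, AFFIRMATIVE, confirmation_prompt, and is_confirmed are all hypothetical names.

```python
from typing import Optional

HIGH_RISK_ACTIONS = {"purchase"}  # hypothetical: actions worth confirming
AFFIRMATIVE = {"yes", "yeah", "yep", "correct"}  # hypothetical, minimal list


def confirmation_prompt(action: str, item: str) -> Optional[str]:
    """Return an explicit confirmation question for high-risk actions,
    or None when the action can proceed without confirmation."""
    if action in HIGH_RISK_ACTIONS:
        return f"Okay, that's one order of the {item}, right?"
    return None


def is_confirmed(reply: str) -> bool:
    """Treat a short affirmative reply as confirmation; anything else is not."""
    return reply.strip().rstrip(".!").lower() in AFFIRMATIVE


prompt = confirmation_prompt("purchase", "cheesecake special")
confirmed = is_confirmed("Yes.")
```

When is_confirmed returns False, the skill would re-prompt with the original question instead of completing the purchase.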

Users expect Alexa to store and retrieve data

Unless the question is rhetorical, most people ask a question with the understanding that the other party can answer it. If a user asks your skill a question, it’s because they believe the skill can answer it. The following example shows a returning user who asks for specials. Because this user has used the skill before, Alexa has stored previous order information through an order history API and uses that information when it retrieves specials from a different API. Alexa hides this complexity from the user.

User: What are the specials today?

Alexa: We have a lot of great specials! Based off your previous orders, you might like our new German chocolate cake.
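The flow described above, where one API holds order history and another returns specials, can be sketched in Python with in-memory stand-ins. Everything here is hypothetical: ORDER_HISTORY and SPECIALS replace the real APIs, the user IDs and flavor tags are invented sample data, and the matching rule (same flavor as a previous order) is an assumption made for the example.

```python
# In-memory stand-ins for the two APIs; all data is invented for illustration.
ORDER_HISTORY = {"user-123": [{"flavor": "chocolate", "size": "half sheet"}]}
SPECIALS = [
    {"name": "cheesecake", "flavor": "vanilla"},
    {"name": "German chocolate cake", "flavor": "chocolate"},
]


def get_order_history(user_id):
    """Stand-in for the order history API."""
    return ORDER_HISTORY.get(user_id, [])


def get_specials():
    """Stand-in for the specials API."""
    return list(SPECIALS)


def specials_response(user_id):
    """Combine both sources into one reply; the user never sees the two calls."""
    reply = "We have a lot of great specials!"
    past_flavors = {order["flavor"] for order in get_order_history(user_id)}
    for special in get_specials():
        if special["flavor"] in past_flavors:
            return (
                f"{reply} Based off your previous orders, "
                f"you might like our new {special['name']}."
            )
    return reply  # new users get the generic reply


returning_user_reply = specials_response("user-123")
new_user_reply = specials_response("user-999")
```

A returning user gets a personalized suggestion, while a user with no history hears only the generic line.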

Users expect Alexa to keep the conversation going

When a user completes a goal, it can be useful to offer proactive information that engages the user and encourages them to start a new goal. For example, in the following dialog, a user has finished ordering a cake for delivery, and Alexa then offers to track the order.

Alexa: Does that complete your order?

User: Yes.

Alexa: Great. One cheesecake on its way. It should be there in 30 minutes. Would you like me to track the order?

User expectation quick reference

Use the following quick reference table to guide you in the type of Alexa responses you can design with Alexa Conversations.

Users expect Alexa to...	Example dialog
Provide variety	Alexa: We have all types of great cakes for you today. Alexa: There are so many great specials for you!
Confirm that it heard the user correctly	User: I’d like to purchase the cheesecake special. Alexa: Okay, that's one order of the cheesecake special, right? User: Yes.
Remember information about a user	User: What are the specials today? Alexa: We have a lot of great specials! Based off your previous orders, you might like our new German chocolate cake.
Keep the conversation going	Alexa: Does that complete your order? User: Yes. Alexa: Great. One cheesecake on its way. It should be there in 30 minutes. Would you like me to track the order?

Considerations before you write dialogs

Before you write your dialogs, it's important to examine your skill in a larger context. Your answers to the following questions serve as a roadmap when you start to design your dialogs.

What’s the purpose of your skill?
The purpose is the guiding principle for your dialogs, which in turn train the AI. For example, the cake skill lets users order custom cakes from the comfort of home and have them delivered to their door.

What are the user’s goals?
What do they get out of this experience? Goals help you identify your core dialogs. For example, users can order a cake with preset options or customize a cake to their liking.

Golden dialogs

After you have a clear understanding of your purpose and goals, you can begin writing out your golden dialogs. Golden dialogs outline the optimal user journey (also known as the "happy path") of expected conversations that will achieve the user's goals.

Here’s an example of a golden dialog for a user ordering a cake:

User: I'd like a chocolate half sheet cake.

Alexa: Got it. What type of icing would you like?

User: Buttercream.

Alexa: Got it. That’s one chocolate half sheet cake with buttercream icing.

Divergent dialogs

A conversation can diverge from the original topic in many ways. Alexa Conversations can handle switching context or correcting a mistake, but it’s always helpful to write out divergent dialogs. They help train the understanding engine for a multitude of use cases where a user takes an unexpected path to achieve their goals.

Some questions to keep in mind when you design divergent dialogs:

What’s the most likely scenario to design for where the user diverges from the golden dialog path?
What if the user says the opposite of what you expect (for example, "yes" instead of "no")?
What happens if Alexa fails to achieve the user’s goal?
What if the user doesn’t respond to a prompt?

Here are a couple of examples of divergent dialogs when a user orders a cake:

User doesn't respond

User: I'd like a chocolate half sheet cake.

Alexa: Got it. What type of icing would you like?

User: (no response)

Alexa: We have a variety of icings such as buttercream, ganache, and fondant. What type of icing can I get for you?

System fails

User: I'd like a chocolate half sheet cake.

Alexa: Unfortunately, it looks like our ordering system is down. Would you like me to call the restaurant for you?

Best practices for writing dialogs

The more sample dialogs, the better - It's always useful to design different variations of user utterances and dialogs. The usability-testing phase of a project is a great way to capture utterances that you didn’t envision yourself.
Provide just enough context - When a user under-answers a question, use a similar structure in the re-prompt as in the original prompt. This technique avoids user confusion and keeps the conversation moving along.
Use confirmations sparingly - Use confirmations only in scenarios where there would be an excessive burden on the user if Alexa misheard. Ask yourself, “Will a user want to wait to hear that this action is confirmed explicitly?”
Maintain trust - Trust in your skill is important. Make sure that your skill retrieves data from trustworthy sources and presents it in a conversational tone rather than as robotic output.

Do

Respond with variety - Providing variety in Alexa’s responses keeps the conversation delightful and avoids sounding robotic and boring.

Don't

Respond with an inconsistent voice - Your skill’s personality is the conversational partner of your user. A consistent personality lets the user know they’re talking to a single voice. Responding outside of that personality can be jarring.

Do

You don’t need to write out every possible variation. Your dialogs are samples of what a user might say to help train the AI for a more natural, flexible conversation that can switch context easily. The AI figures out the variations for you.

Don't

Overly design for corrections or context switching - The AI handles these elements for you as long as you provide a few sample dialogs to train it. If something’s wrong, go back and fix your core dialogs used to train the AI.

Do

Ask open-ended questions whenever possible - When a user knows the choices that are available to them, ask about the choices openly. Alexa Conversations can fill in multiple slots with a single utterance. A good question to ask yourself is, “Would a user know what the options are if I don’t list out a few?”

Don't

Hand-hold through the experience - Users often expect voice experiences to be constrained, step-based experiences. They think they must answer a game of 20 questions, and then they eventually get to the end goal. Change expectations; ask open-ended questions.

Do

Proactively offer more when a user completes a goal - With Alexa Conversations, you can proactively engage users about other features that they might not discover on their own.

Don't

Offer more when the user is acting on a goal - Nobody likes to be bothered when they are trying to focus. Don’t complicate things when a user is already trying to act on a goal.