Think back to conversations with another person that felt natural. Those conversations are often cooperative and move forward without friction. You don’t think about what must happen next; you simply talk. Replicating this naturalness in conversations between your users and Alexa is a hard problem to solve. Conversations between people are nuanced: they're unscripted and can span many topics. Alexa Conversations handles much of this nuance for you.
The following sections explain how Alexa Conversations works, its awareness of user behaviors and expectations, and some best practices. Use this content as a guide to help you design natural, seamless sample dialogs.
Alexa Conversations is an artificial intelligence (AI) driven engine that you can train with sample dialogs to create simulated dialogs. The simulated dialogs represent a wide range of user responses in a turn-based format between the user and Alexa. The AI engine outputs a variety of utterances, understands pronouns, carries over context, handles over- and under-answering, and adjusts for user corrections.
Alexa Conversations has a significant understanding of user behavior. If your sample dialogs include a few examples of these behaviors, Alexa Conversations ensures that your skill is able to handle a wide variety of possible user inputs.
There’s what a person says and then there’s the intent behind what they say. A user’s background, environment, and state of mind are just some of the factors to consider. For example, the following two utterances use different phrasing for the same idea:
User: I'd like to order a cake.
User: Get me a cake, please.
When people speak, they often jump ahead in a conversation or refer back to a previous discussion point. When they refer back, they tend to use pronouns such as "it" or "that" to avoid repetition. Alexa Conversations handles the inferred context so that the dialog can continue in a natural flow.
User: I'd like a chocolate half sheet cake.
Alexa: Got it. What type of icing would you like?
User: How many people can that feed?
Alexa: A half sheet cake feeds about 20 people.
User: Make it a full sheet cake.
Alexa: Chocolate full sheet cake, got it. What type of icing would you like?
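To make the idea of carried-over context concrete, the following Python sketch keeps track of the most recently mentioned item and substitutes it for a standalone "it" or "that". This is only an illustration: Alexa Conversations resolves context for you, and `resolve_pronouns` is a hypothetical helper, not part of any Alexa API. Naive substitution like this would also rewrite "that" where it is used as a conjunction.

```python
import re

def resolve_pronouns(utterance, last_mentioned):
    """Replace a standalone "it" or "that" with the most recently mentioned item."""
    return re.sub(r"\b(?:it|that)\b", last_mentioned, utterance, flags=re.IGNORECASE)

# Context carried over from the earlier turn "I'd like a chocolate half sheet cake."
context = "a half sheet cake"
resolved = resolve_pronouns("How many people can that feed?", context)
print(resolved)  # How many people can a half sheet cake feed?
```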
Many users encounter voice experiences that are inflexible in a conversation, limited to one topic or question per turn. Users can find this experience time-consuming and frustrating because they want to give extra information or they anticipate further related questions.
The following example shows how Alexa Conversations handles over-answering. In this case, the skill asks for the size but the user gives more information (size, flavor, and icing). With Alexa Conversations, the skill takes the user’s input, and then moves on to gathering the next piece of information.
Alexa: What size cake do you want?
User: I want a chocolate half sheet cake with buttercream icing.
Alexa: Chocolate half sheet cake with buttercream, got it. Would you like anything else?
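As a rough illustration of over-answering, the sketch below pulls every recognizable slot value out of a single utterance instead of only the one that was asked for. The slot names, the value lists, and the keyword matching are all assumptions made for this example; Alexa Conversations uses its trained model for this, not a lookup like the one shown.

```python
# Hypothetical slot vocabulary for the cake example; not part of any Alexa API.
SLOT_VALUES = {
    "flavor": ["chocolate", "vanilla"],
    "size": ["half sheet", "full sheet"],
    "icing": ["buttercream", "ganache", "fondant"],
}

def extract_slots(utterance):
    """Return every slot value mentioned in the utterance, keyed by slot name."""
    text = utterance.lower()
    found = {}
    for slot, values in SLOT_VALUES.items():
        for value in values:
            if value in text:
                found[slot] = value
                break  # keep the first match for this slot
    return found

slots = extract_slots("I want a chocolate half sheet cake with buttercream icing.")
print(slots)
# {'flavor': 'chocolate', 'size': 'half sheet', 'icing': 'buttercream'}
```

Because all three slots are filled from one turn, the skill has nothing left to ask about the cake and can move straight to "Would you like anything else?"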
Occasionally, users might misunderstand or only hear part of a question. To help move the conversation along, Alexa takes the information given and prompts the user for any missing pieces.
The following example shows how Alexa Conversations handles under-answering. In this case, the skill asks for a size and flavor but the user only responds with a flavor. Alexa then asks for the missing size to fill in the gap.
Alexa: Let me know the size and flavor of your first cake.
User: Chocolate.
Alexa: And what size?
User: Half sheet.
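The under-answering behavior can be pictured as a check for required information that is still missing. The following is a minimal sketch under that assumption; `REQUIRED_SLOTS`, `PROMPTS`, and `next_prompt` are hypothetical names, and the "And what flavor?" prompt is invented for symmetry with the example above.

```python
# Hypothetical required slots and follow-up prompts for the cake example.
REQUIRED_SLOTS = ["size", "flavor"]
PROMPTS = {
    "size": "And what size?",
    "flavor": "And what flavor?",
}

def next_prompt(filled):
    """Return the prompt for the first missing slot, or None when nothing is missing."""
    for slot in REQUIRED_SLOTS:
        if slot not in filled:
            return PROMPTS[slot]
    return None

filled = {"flavor": "chocolate"}   # the user answered only "Chocolate."
first = next_prompt(filled)
print(first)                       # And what size?

filled["size"] = "half sheet"      # the user answers "Half sheet."
second = next_prompt(filled)
print(second)                      # None
```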
Conversations aren’t linear. People might answer and then change their mind. In these circumstances, a user doesn’t have to leave the skill and come back. Instead, they can simply course correct in one step without a problem. The following example shows a user changing their mind about the cake size.
Alexa: I have a chocolate half sheet cake with buttercream icing. Would you like anything else?
User: Actually, make that a full size.
Alexa: Ok. I have a chocolate full sheet cake with buttercream icing. Anything else?
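One way to think about a correction is as an update to a single piece of the order while everything else is carried forward. The sketch below assumes the order is held as a simple dictionary; `apply_correction` and `summarize` are hypothetical helpers for illustration, not Alexa Conversations functions.

```python
def apply_correction(order, corrected):
    """Return a new order in which only the corrected slots are replaced."""
    return {**order, **corrected}

def summarize(order):
    """Render the order the way Alexa reads it back in the example."""
    return f"I have a {order['flavor']} {order['size']} cake with {order['icing']} icing."

order = {"flavor": "chocolate", "size": "half sheet", "icing": "buttercream"}

# "Actually, make that a full size." changes the size and nothing else.
updated = apply_correction(order, {"size": "full sheet"})
print(summarize(updated))
# I have a chocolate full sheet cake with buttercream icing.
```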
Use the following quick reference table to guide you in the type of user-behavior dialogs you can design when you train an Alexa Conversations model.
Users... | Example dialog |
---|---|
Say a variety of phrases for the same idea | User: "I'd like to order a cake", "Get me a cake, please." |
Employ pronouns and carry over context | User: I'd like a chocolate half sheet cake. Alexa: Got it. What type of icing would you like? User: How many people can that feed? Alexa: A half sheet cake feeds about 20 people. |
Over-answer a question | Alexa: What size cake do you want? User: I want a chocolate half sheet cake with buttercream icing. Alexa: Chocolate half sheet cake with buttercream, got it. Would you like anything else? |
Under-answer a question | Alexa: Let me know the size and flavor of your first cake. User: Chocolate. Alexa: And what size? |
Correct themselves | Alexa: I have a chocolate half sheet cake with buttercream icing. Would you like anything else? User: Actually, make that a full size. Alexa: Ok. I have a chocolate full sheet cake with buttercream icing. Anything else? |
Alexa Conversations is also aware of what users expect from Alexa. Knowing these expectations helps you understand the types of responses you can design when training a model.
Providing variety in Alexa’s responses keeps the conversation delightful and avoids sounding robotic and boring. For example, here are some variations on giving the daily specials:
Alexa: We have all types of great cakes for you today.
Alexa: There are so many great specials for you!
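Conceptually, response variety amounts to choosing among several phrasings of the same message. The sketch below shows that idea with Python's standard `random.choice`; the `pick_response` helper and the list name are assumptions for illustration, and Alexa Conversations selects among the variations you author without code like this.

```python
import random

# The two variations from the example above.
SPECIALS_RESPONSES = [
    "We have all types of great cakes for you today.",
    "There are so many great specials for you!",
]

def pick_response(variations, rng=random):
    """Pick one phrasing at random so repeated visits don't sound identical."""
    return rng.choice(variations)

response = pick_response(SPECIALS_RESPONSES)
print(response)
```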
Certain high-risk scenarios, such as purchasing an item, require explicit confirmation of a user’s choice. Alexa can confirm that it heard the user correctly and, if not, re-prompt the user to answer the question.
User: I’d like to purchase the cheesecake special.
Alexa: Okay, that's one order of the cheesecake special, right?
User: Yes.
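The confirmation step can be sketched as a gate in front of high-risk actions: nothing is placed until the user has said yes, and a no leads to a re-prompt. Everything here is an assumption for illustration, including the `HIGH_RISK_ACTIONS` set, the `respond` function, and the wording of the re-prompt and success lines.

```python
# Hypothetical set of actions that should never run without a "yes" from the user.
HIGH_RISK_ACTIONS = {"purchase"}

def respond(action, item, confirmed=None):
    """Return Alexa's next line for a requested action.

    confirmed is None until the user answers the confirmation question,
    then True for "yes" or False for "no".
    """
    if action in HIGH_RISK_ACTIONS:
        if confirmed is None:
            return f"Okay, that's one order of the {item}, right?"
        if not confirmed:
            return "Sorry about that. What would you like to order?"
    return f"Great. One {item} on its way."

print(respond("purchase", "cheesecake special"))
# Okay, that's one order of the cheesecake special, right?
print(respond("purchase", "cheesecake special", confirmed=True))
# Great. One cheesecake special on its way.
```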
Unless the question is rhetorical, most people ask a question with the understanding that the other party can answer it. If a user is asking a question of your skill, it’s because they believe the skill can answer their question. The following example shows a returning user who asks for specials. Because this user has used the skill before, Alexa has stored previous order information in an order history API and uses that information when retrieving specials from a different API. Alexa hides this complexity from the user.
User: What are the specials today?
Alexa: We have a lot of great specials! Based off your previous orders, you might like our new German chocolate cake.
When a user completes a goal, it can be useful to offer proactive information that engages and encourages the user to start a new goal. For example, in the following dialog, a user has finished ordering a cake for delivery, and Alexa then offers to track the order.
Alexa: Does that complete your order?
User: Yes.
Alexa: Great. One cheesecake on its way. It should be there in 30 minutes. Would you like me to track the order?
Use the following quick reference table to guide you in the type of Alexa responses you can design with Alexa Conversations.
Customers expect Alexa to... | Example dialog |
---|---|
Provide variety | Alexa: We have all types of great cakes for you today. Alexa: There are so many great specials for you! |
Confirm that it heard the user correctly | User: I’d like to purchase the cheesecake special. Alexa: Okay, that's one order of the cheesecake special, right? User: Yes. |
Remember information about a user | User: What are the specials today? Alexa: We have a lot of great specials! Based off your previous orders, you might like our new German chocolate cake. |
Keep the conversation going | Alexa: Does that complete your order? User: Yes. Alexa: Great. One cheesecake on its way. It should be there in 30 minutes. Would you like me to track the order? |
Before writing out your dialogs, it's important to examine your skill in a larger context. Use your answers to the following questions as a roadmap when you start to design your dialogs.
What’s the purpose of your skill?
The purpose is the guiding principle: the dialogs that train the AI should all align with it. For example, the cake skill lets you order custom cakes delivered to your door from the comfort of home.
What are the user’s goals?
What do they get out of this experience? Goals help you identify your core dialogs. For example, in a pizza-ordering skill, users can order a pie with preset options or customize a pie to their liking.
After you have a clear understanding of your purpose and goals, you can begin writing out your golden dialogs. Golden dialogs outline the optimal user journey (also known as the "happy path") of expected conversations that will achieve the user's goals.
Here’s an example of a golden dialog for a user ordering a cake:
User: I'd like a chocolate half sheet cake.
Alexa: Got it. What type of icing would you like?
User: Buttercream.
Alexa: Got it. That’s one chocolate half sheet cake with buttercream icing.
A conversation can diverge from the original topic in many ways. Alexa Conversations can handle switching context or correcting a mistake, but it’s always helpful to write out divergent dialogs. They help train the understanding engine for a multitude of use cases where a user takes an unexpected path to achieve their goals.
Some questions to keep in mind when you design divergent dialogs:
Here are a couple of examples of divergent dialogs when a user orders a cake:
User: I'd like a chocolate half sheet cake.
Alexa: Got it. What type of icing would you like?
User: (no response)
Alexa: We have a variety of icings such as buttercream, ganache, and fondant. What type of icing can I get for you?

User: I'd like a chocolate half sheet cake.
Alexa: Unfortunately, it looks like our ordering system is down. Would you like me to call the restaurant for you?
Ask open-ended questions whenever possible: When a user knows the choices that are available to them, ask about the choices openly. Alexa Conversations can fill multiple slots from a single utterance. A good question to ask yourself is, “Would a user know what the options are if I don’t list out a few?”