
Natural Speech

Key takeaways

To create a natural-sounding conversation that pleases your customers, you’ll need to write dialogs for your skill that are meant to be spoken, not read.

 

Need quick advice?

View the full design checklist for natural speech.

 


Writing for the ear

Write dialogs that are meant to be heard in spoken conversation, not read. Conversations that customers have with your skill should feel as close as possible to a telephone conversation with a friend. When the dialog flows naturally and is easy to understand, a customer is more likely to keep using your skill over time. People don’t speak the way they write. Human conversation partners take turns, pick up on cues to listen or speak, and use contractions, fragments, and shorthand.

 

Checklist for taking turns & using conversation cues:

▢  Ask a question only immediately before opening the microphone

▢  Use conversation cues that indicate progress on a multi-step task

▢  Use implicit, explicit, and contextual cues to indicate Alexa understood the customer

▢  Display progress cues on Alexa devices with a screen


Ask questions

When people have a multi-turn conversation with the goal of completing a task or obtaining some kind of output such as paying a bill or searching for a flight, they will exchange pleasantries, set an agenda or establish the goal, and exchange any necessary information needed to complete the task.

The best way to ensure the customer knows the skill needs their input is to ask them a question. A question implies it’s the customer’s turn to speak, and states the information needed from the customer more clearly than a statement might. People tend to answer a question immediately upon being asked, so the skill should ask the question at the end of its response to avoid interruptions and errors. For example, compare the two conversations below, which handle questions very differently.

 

Don't:  

 

Customer: Alexa, ask MyFlightFinder if my flight is on time

Alexa: You can give me the flight details and I will tell you the status of your flight.

<opens mic>

Customer: Uh … It’s to Anchorage. Today. At noon. And I’m leaving from––

Alexa: Anchorage, Alaska. Got it. When is the flight? I––

Customer: Today at noo– (talking over)

Alexa: I will need the date and time.

Do:  

 

Customer: Alexa, ask MyFlightFinder if my flight is on time.

Alexa: To find your trip with MyFlightFinder, I have a few questions. First, where are you going?

Customer: Anchorage.

Alexa: Anchorage, Alaska. Got it. When are you going?

Customer: Today at noon.

Use progress markers like "first", "got it", and "almost done" to show customers how far along they are, reassure them the process won’t take long, and confirm that the skill understood what they said.

 

Do:

 

Customer: Alexa, ask MyFlightFinder if my flight is on time

Alexa: To find your trip with MyFlightFinder, I have a few questions. First, where are you going?

Customer: Anchorage.

Alexa: Anchorage, Alaska. Got it. When are you going?

Customer: Today at noon

Alexa: Ok. Today at 12 pm. Almost done. From which airport are you leaving?

Customer: I’m departing from Seattle

Alexa: Alaska Flight 10 from Seattle to Anchorage, leaving today at 12 pm, is delayed. Its new departure time is today at 2 pm. Want to check another trip?

Conversely, listen to this conversation, which doesn’t include any such cues. The customer won’t know when the conversation will end, or whether Alexa understood them:

 

Don't:  

 

Customer: Alexa, ask MyFlightFinder if my flight is on time

Alexa: Where are you going?

Customer: Anchorage.

Alexa: When are you going?

Customer: Today at noon

Alexa: From which airport are you leaving?

Customer: I’m departing from Seattle

Alexa: That flight is delayed. Its new departure time is at 2 pm. Want to check another trip?

Who knows if that was the right flight? Our customer wouldn’t.

Additionally, on Alexa devices with a screen, your skill should use the display to add more context, such as hints or a progress bar.
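The progress cues above can be sketched in a few lines of Python. This is a minimal illustration, assuming a skill that asks a fixed number of questions in order. The function name `build_prompt` and the marker wording are hypothetical and not part of any Alexa SDK.

```python
def build_prompt(step, total, question, heard=None):
    """Compose one turn: confirmation, progress marker, then the question last.

    step is the 1-based number of the question being asked, total is how many
    questions the task needs, and heard is the skill's reading of the
    customer's previous answer (if any).
    """
    parts = []
    if heard:
        # Implicit confirmation: echo back what the skill understood.
        parts.append(f"{heard}. Got it.")
    if step == 1 and total > 1:
        # Set expectations up front, then lead into the first question.
        parts.append("I have a few questions. First,")
        question = question[0].lower() + question[1:]
    elif step == total and total > 1:
        parts.append("Almost done.")
    # The question always comes last, right before the microphone opens.
    parts.append(question)
    return " ".join(parts)


print(build_prompt(1, 3, "Where are you going?"))
print(build_prompt(2, 3, "When are you going?", heard="Anchorage, Alaska"))
print(build_prompt(3, 3, "From which airport are you leaving?", heard="Today at 12 pm"))
```

Keeping the marker logic in one place makes it easier to keep every prompt ending on its question.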


Keep contextual

Use what your customer has already said in the current conversation (and in previous ones, where possible) to move the conversation along more quickly and to avoid sounding redundant or incompetent.

Checklist for a contextually relevant conversation:

▢  Allow the customer to over-answer; don’t make them repeat this information

▢  Allow the customer to under-answer; follow up for additional information as needed

▢  Hold context within the conversation; don’t make the customer re-state or re-confirm details

▢  Hold some context across multiple sessions; demonstrate your skill remembers relevant details of previous conversations and uses them to expedite or enhance an experience

▢  Don’t state the obvious

▢  Display additional context on Alexa devices with a screen

 

Imagine if, in our conversation with the hypothetical MyFlightFinder, the skill ignored what the customer said when they over-answered one of its questions.

Don't:

Customer: Alexa, ask MyFlightFinder if my flight to Anchorage at noon is on time

Alexa: To find your trip with MyFlightFinder, I have a few questions. First, where are you going?

Customer: I said Anchorage!

Alexa: Anchorage, Alaska. Got it. What time are you going?

Customer: Today at noon!

Alexa: Noon on what day?

Customer: I said today! Today at noon!

Alexa: Today at 12 pm, got it. Almost done. From which airport are you departing?

Customer: Seattle

Alexa: Alaska Flight 10 from Seatac International Airport to Anchorage, leaving today at 12 pm, is delayed. Its new departure time is at 2 pm. Want to check another trip?

Instead, that conversation could have moved along with less friction if the skill allowed the customer to over-answer.

 

Do:

 

Customer: Alexa, ask MyFlightFinder if my flight to Anchorage at noon is on time

Alexa: Welcome to MyFlightFinder. Looking for flights to Anchorage, Alaska, today at 12 pm. What airport are you leaving from?

Customer: Seattle

Alexa: Alaska Flight 10, from Seattle to Anchorage, leaving today at 12 pm, is delayed. Its new departure time is at 2 pm. Want to check another trip?
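One way to support over-answering is to track which details the customer has already given and ask only for what is missing. The Python sketch below assumes hypothetical slot names (`destination`, `date`, `time`, `origin`) and prompts; a real skill would define its own.

```python
# Hypothetical slot names and prompts, listed in the order the skill asks for them.
PROMPTS = {
    "destination": "Where are you going?",
    "date": "What day are you leaving?",
    "time": "What time are you leaving?",
    "origin": "What airport are you leaving from?",
}


def next_question(filled):
    """Return the prompt for the first slot still missing, or None if all are filled."""
    for slot, prompt in PROMPTS.items():
        if not filled.get(slot):
            return prompt
    return None


# "my flight to Anchorage at noon" over-answers: it already supplies the
# destination and the time. Defaulting the date to today is an assumption
# that mirrors the dialog above.
filled = {"destination": "Anchorage", "date": "today", "time": "12 pm"}
print(next_question(filled))  # only the departure airport is left to ask
```

The same function also covers under-answering: whatever the customer leaves out is simply the next question.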

Remember customers’ responses not just within the current session, but across multiple sessions over time. This builds customer trust and rapport with your skill. What if our MyFlightFinder customer came back to the skill later that day? Chances are good that the flight they want to track will be the same one they just tracked. Here’s how we might greet them:

 

Customer: Alexa, ask MyFlightFinder where my flight is

Alexa: Welcome back to MyFlightFinder. Your flight from Seattle to Anchorage is still delayed, and will depart today at 2 pm. Want to check another trip?
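As a rough sketch of holding context across sessions, the Python below keeps the last tracked flight per customer and uses it to choose a greeting. The `store` dictionary stands in for whatever persistent storage your skill uses, and the function names and greeting wording are illustrative assumptions.

```python
def remember_flight(store, customer_id, origin, destination):
    """Save the flight a customer last tracked; store stands in for real persistence."""
    store[customer_id] = {"origin": origin, "destination": destination}


def greeting(store, customer_id):
    """Greet returning customers with the flight they tracked last time."""
    last = store.get(customer_id)
    if last is None:
        return ("Welcome to MyFlightFinder. To find your trip, "
                "I have a few questions. First, where are you going?")
    return ("Welcome back to MyFlightFinder. Still tracking your flight "
            f"from {last['origin']} to {last['destination']}?")


store = {}
print(greeting(store, "customer-1"))   # first visit: full introduction
remember_flight(store, "customer-1", "Seattle", "Anchorage")
print(greeting(store, "customer-1"))   # later visit: picks up where they left off
```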


Use casual style & grammar

Checklist for using casual grammar & style:

▢  Speak in active voice

▢  Use contractions

▢  Avoid jargon and formal vocabulary

▢  Vary the responses customers will hear the most

▢  Don’t state the obvious

▢  Use the fewest words to convey the most meaning

 

To maximize the customer’s engagement and focus, your skill should speak as if it’s in a casual conversation. Speak directly and efficiently, and avoid jargon or formal vocabulary that may be unfamiliar or confusing for the customer. Use the active voice and contractions where possible to make your skill’s responses sound more natural.

Active voice puts the subject performing the action before the verb, and it usually results in briefer responses.

 

Do:

 

Alexa: Hmm. I’m having trouble with your payment. You can update your account details in your MyCapital app.

Don't:

Alexa: Hmm. Your payment could not be processed at this time. The details of your account need to be updated by you in your MyCapital app.

Be brief, direct, and unambiguous; make every word count. Each word and phrase of the skill’s speech should serve a purpose and carry as much useful information to the listener as possible. Watch for weasel words and phrases: verbiage that adds nothing to the response except length (or worse, makes the content more ambiguous). For example:

 

Do:

 

Alexa: I’m having trouble with your payment.

Don't:  

 

Alexa: A fatal error occurred while I was processing your order.

Similarly, avoid stating the obvious; omit information that a reasonable person would assume. Listen to these indirect and ambiguous instructions in a basic (and hypothetical) trivia game.

 

Don't:

 

Customer: Alexa, start Seattle SuperTrivia

Alexa: The Seattle Super Trivia skill is a trivia game about the city of Seattle. Are you ready to start the game?

Customer: Yes

Alexa: Since you have never played before, I have a few instructions to help you along the way. You can play a short trivia game every day of the week. If you get a perfect score you will earn a bonus trivia question until you fail. You will earn one point for each question answered correctly. Incorrect answers don’t earn any points. You will receive a score at the end. You can ask me for a hint when you need it. Do you want to try a practice question?

Those instructions were long-winded without being specific, and they included details that anyone who has ever played a game would assume. A more direct way to write these instructions might sound like the following:

 

Do:

 

Customer: Alexa, start Seattle SuperTrivia

Alexa: Welcome to Seattle Super Trivia. Think you know Seattle? Think again. Ready to start?

Customer: Yes

Alexa: Each day I’ll have five fresh trivia questions for you. A perfect score earns you a sudden-death bonus round, where you can keep playing as long as you keep winning. You can ask me for one hint each day. Let’s try a practice question.

While it isn’t a complete list, watch for some of the following formal terms in your conversation and choose more natural alternatives where possible to improve your dialogs.

| Formal term | Alternatives | Examples (Don’t / Do) |
| --- | --- | --- |
| Activate, Deactivate | Start, begin, stop, cancel | “Ok, I will deactivate your subscription.” / “Ok, I’ve canceled your subscription.” |
| Available, Unavailable | Free, open, “I can …”; closed, “I can’t …” | “There are no available classes to meet your criteria.” / “I didn’t find any open classes like that.” |
| Capable, Capability, Incapable | “I can/can’t …” | (See next item) |
| Determine, Determined, Undetermined | Find, could/couldn’t | “I can’t determine the flight you asked for. Please try again.” / “I didn’t find any flights like that. Let’s try again …” |
| Request | “what you asked …”, “that” | “I can’t process your request.” / “Hmm. I can’t do that.” |
| Locate | Find | “I couldn’t locate a parking spot.” / “I couldn’t find a parking spot.” |
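If you keep your skill's responses in code or a spreadsheet, a small check can flag formal terms before they reach customers. The Python sketch below is one possible approach, not an official tool. The word list is drawn from the table above, and the function name `flag_formal_terms` is hypothetical.

```python
import re

# Formal terms from the table above, each mapped to a plainer suggestion.
FORMAL_TERMS = {
    "activate": "start",
    "deactivate": "stop or cancel",
    "available": "free or open",
    "unavailable": "closed",
    "determine": "find",
    "request": "what you asked",
    "locate": "find",
}


def flag_formal_terms(response):
    """Return (word, suggestion) pairs for formal terms found in a response.

    Matching is on whole lowercase words only, so inflected forms such as
    "deactivated" are not caught by this simple version.
    """
    found = []
    for word in re.findall(r"[a-z']+", response.lower()):
        if word in FORMAL_TERMS:
            found.append((word, FORMAL_TERMS[word]))
    return found


print(flag_formal_terms("I couldn't locate a parking spot."))
print(flag_formal_terms("I couldn't find a parking spot."))
```

A check like this only surfaces candidates. Whether a flagged word is actually too formal in context is still a writing decision.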


Vary your responses

Your skill’s features should give your customers a reason to come back often. Consider how often they’re likely to return: once a month, once a week, many times a week, or even many times a day. What messages will they hear most often? Which will become tiresome to hear each time? Consider what information or instruction your skill gives new customers that they won’t need once they become familiar with its functions. The following steps and elements of your conversation are most at risk of becoming tired, weird, or irritating over time without variation:

 

Greetings and pleasantries

Skills should vary any pleasantries, such as the way they say “hello,” “Let’s start your flight search,” or “goodbye.” As discussed in Design Principles: Be Contextual, skills should greet their customers differently based on a number of contexts:

  • the customer has never used the skill before
  • they’re a longtime user
  • they haven’t linked an account with essential info
  • they had an action in progress in their previous session
  • and more, depending on the functions of your skill.


Repeats/reprompts

You might consider varying the way your skill repeats questions (re-prompts) by shortening or removing the message that precedes the question, since the customer has already heard it once. You can learn more about handling errors in the next section, Design Patterns.


Do:

 

Customer: Alexa, ask MyFlightFinder if my flight to Anchorage at noon is on time

Alexa: Welcome to My Flight Finder. Looking for flights to Anchorage Alaska, today at 12 pm. What airport are you leaving from?

Customer: …

Alexa: I didn’t catch that. What airport are you leaving from?

Don't:

 

Customer: Alexa, ask MyFlightFinder if my flight to Anchorage at noon is on time

Alexa: Welcome to My Flight Finder. Looking for flights to Anchorage Alaska, today at 12 pm. What airport are you leaving from?

Customer: …

Alexa: Welcome to My Flight Finder. Looking for flights to Anchorage Alaska, today at 12 pm. What airport are you leaving from?

Confirmations

Vary the quick confirmations your skill uses, such as “Ok,” “Got it,” “Thanks,” “Noted,” and more, to confirm the skill heard what the customer said.
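A simple way to vary confirmations is to pick one at random while avoiding the one the customer just heard. The Python below is a minimal sketch. The list contents and the function name `pick_confirmation` are illustrative assumptions.

```python
import random

# Short confirmations the skill can rotate through.
CONFIRMATIONS = ["Ok.", "Got it.", "Thanks.", "Noted."]


def pick_confirmation(last_used=None):
    """Pick a confirmation at random, never repeating the previous turn's choice."""
    choices = [c for c in CONFIRMATIONS if c != last_used]
    return random.choice(choices)


last = None
for _ in range(4):
    last = pick_confirmation(last)
    print(last)
```

The same pattern works for greetings, goodbyes, and on-screen hints: keep a small pool of variants and remember the last one used.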

 

Instructions and hints

Include instructions, hints and additional help only for first-time users. Repeat customers should get a streamlined, quick experience.

 

Vary on-screen responses

Where customers will see a screen often, consider varying static images, cycling through a number of “hints” for using your skill instead of always displaying the same one, and varying greetings and pleasantries in the same way you did for your verbal responses. Consider seasonality as well. For example, if a child asks the NORAD Tracks Santa skill to find Santa in July, the skill will respond that Santa is on vacation, and display an image of Santa at the beach or camping.


Prevent errors

Of course, the best way to handle errors is to prevent them in the first place. There are a few ways you can design your skill’s dialog to prevent common errors from occurring.

Ask a question only immediately before opening the microphone for input to prevent the customer speaking over Alexa. 

 

Do:

 

Alexa: You can ask me about a city, state, province, or region. Where do you want to go?

Don't:

 

Alexa: Where do you want to go? You can ask me about a city, state, province, or region.
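The "question last" rule lends itself to a quick automated check on your prompts. The Python below is a rough heuristic, assuming sentences end in `.`, `!`, or `?`. The function name `check_turn_taking` and the warning text are hypothetical. Note that it also flags deliberate rhetorical questions, so treat its output as a prompt for review rather than a verdict.

```python
import re


def check_turn_taking(response):
    """Return warnings for a prompt that is about to open the microphone."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", response.strip())
    warnings = []
    if not sentences[-1].endswith("?"):
        warnings.append("Prompt does not end with a question.")
    if any(s.endswith("?") for s in sentences[:-1]):
        warnings.append("A question appears before the end of the prompt.")
    return warnings


good = "You can ask me about a city, state, province, or region. Where do you want to go?"
bad = "Where do you want to go? You can ask me about a city, state, province, or region."
print(check_turn_taking(good))
print(check_turn_taking(bad))
```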

Present simple options

While you should try to be as flexible as possible with the way your skill supports a range of customer requests, you can optimize for simplicity when there is a risk you will get such a large range of answers that it would be burdensome (or impossible) to support.

 

Do:

 

Alexa: From Seattle to Melbourne, I found two flights departing today. What time are you leaving?

Risky:

 

Alexa: I found two flights departing today from Seattle to Melbourne. You can tell me what time you are leaving and the name of the airline. What would you like?

Don't:

 

Alexa: I found two flights departing today from Seattle to Melbourne. How can I help?

Read more about best practices for handling lists in Patterns & Components.

Still, conversational errors will happen, just as they do in human-to-human conversation. The next section describes some actions skills can take that will help recover from those errors. Some involve the content of the message itself, and some the way the customer is routed through steps to correct the mistake. 


Handle errors gracefully

Checklist for handling errors gracefully:

▢  Use plain language and simple prompts to reduce errors

▢  Explain to the customer what went wrong using the best available data

▢  Clarify the information needed and why it’s needed to continue

▢  Anticipate questions that won’t be supported in the skill and create an answer or redirect to another source

▢  Offer a contextually relevant response when the customer asks for help


If there is one area where conversations with Alexa differ most from natural human speech, it’s error handling: skills must actively work to understand the customer and help them resolve errors. Many of the error types below aren’t encountered in our everyday conversations, and we almost never reach an error in human conversation that we can’t resolve.

When two people having a conversation make a mistake or recognize a misunderstanding, they say different things to overcome the error: nobody continues repeating what they said verbatim at one another, hoping something will change. Similarly, your skill should respond differently to a customer when they reach an error, depending on the cause of the error and how many times they’ve reached it.

Take escalating steps to resolve the error. A first-time error might be a simple mistake: perhaps the customer stopped paying attention and simply needs your skill to repeat the question. If they reach the same error again, they might still be thinking, or they may be confused about their choices or the question the skill asked, so a second error may require more explanation from the skill. Repeated errors indicate that the customer either needs additional help or, if the skill repeatedly receives no response from the customer, has abandoned the skill, in which case it should end the session.

Where possible, ask the customer disambiguating questions if the skill needs more information, rather than requiring they provide complete information in a specific format. Allow customers to provide incomplete information, and follow up on it as needed. (“Did you mean 12 noon, or 12 midnight?”)
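The escalation described above can be sketched as a function of how many times in a row the same question has failed. This Python example is an illustration under stated assumptions: the function name `reprompt`, its parameters, the three-step threshold, and the wording (including the "help" suggestion) are all hypothetical.

```python
def reprompt(question, reason, attempt, heard_anything):
    """Choose an escalating reprompt; returns (speech, end_session).

    attempt counts consecutive failures on this question, starting at 1.
    reason explains why the skill needs the answer.
    heard_anything is False when the customer stayed silent.
    """
    if attempt == 1:
        # First failure: a short nudge plus the same question.
        return f"I didn't catch that. {question}", False
    if attempt == 2:
        # Second failure: explain what is needed and why.
        return f"{reason} {question}", False
    if heard_anything:
        # Repeated misunderstandings: point the customer toward more help.
        return f"{reason} You can also ask for help. {question}", False
    # Repeated silence: assume the customer has left and end the session.
    return "", True


why = "To find your flight, I need a date or time."
for n in (1, 2, 3):
    print(reprompt("When are you going?", why, n, heard_anything=False))
```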

Listen to the following examples of a skill attempting to resolve errors using (or not using) the methods above, with varying levels of success.

 

Bad:

 

Customer: Alexa, ask MyFlightFinder where my flight is.

Alexa: Welcome to MyFlightFinder. To find your trip, I have a few questions. First, where are you going?

Customer: Melbourne

Alexa: Hmm. I’m not sure about that one. Where are you going?

Customer: Melbourne!

Alexa: Hmm. I’m not sure about that one. Where are you going?

Customer: I’m going to Melbourne, Florida!

Alexa: Melbourne, Florida. Got it. When are you going?

Customer: …

Alexa: Melbourne, Florida. Got it. When are you going?

Customer: …

Alexa: Melbourne, Florida. Got it. When are you going?

Customer: …

Alexa: Melbourne, Florida. Got it. When are you going?

Customer: It’s today sometime

Alexa: I’m sorry. You need to tell me your departure date and time. When are you going?

Customer: Nevermind. Exit

A little better:

 

Customer: Alexa, ask MyFlightFinder where my flight is.

Alexa: Welcome to MyFlightFinder. To find your trip, I have a few questions. First, where are you going?

Customer: Melbourne

Alexa: I’ll need to know the city, and state or province of your destination. Where are you going?

Customer: Melbourne, Florida.

Alexa: Melbourne, Florida. Got it. When are you going?

Customer: …

Alexa: I didn’t catch that. When are you going?

Customer: …

Alexa: I need your departure date and time to find your flight. When are you going?

Customer: It’s today sometime

Alexa: Today at what time?

Customer: I don’t know

Alexa: Ok. From which airport are you leaving?

Customer: Seattle

Alexa: I found two flights today from Seatac International Airport to Melbourne, Florida. The first one departs at 12 pm. The second one departs at 7 pm. Which do you want?

Customer: Oh it’s the one at noon

Alexa: Hmm. I’m not sure about that. I found two flights today from Seatac International Airport to Melbourne, Florida. The first one departs at 12 pm. The second one departs at 7 pm. Which do you want?

Customer: The first one

Alexa: Hmm. I’m not sure about that. I found two flights today from Seatac International Airport to Melbourne, Florida. The first one departs at 12 pm. The second one departs at 7 pm. Which do you want?

Customer: 12 PM!

Alexa: The flight from Seatac International Airport to Melbourne, Florida, leaving today at 12 pm is delayed. There is no updated departure time available. Want to check another trip?

Best:

 

Customer: Alexa, ask MyFlightFinder where my flight is.

Alexa: To find your trip with MyFlightFinder, I have a few questions. First, where are you going?

Customer: Melbourne

Alexa: Is that Melbourne in Australia, or Florida?

Customer: Florida please

Alexa: Melbourne, Florida. Got it. When are you going?

Customer: …

Alexa: I didn’t catch that. When are you going to Melbourne, Florida?

Customer: …

Alexa: I didn’t catch that. To find your flight, I’ll need to know a date or time you’ll be going. When are you going?

Customer: It’s today sometime

Alexa: Today. Ok. Which airport are you leaving from?

Customer: Seattle

Alexa: I found two flights today, from Seattle to Melbourne, Florida. The first one departs at 12 pm. The second one departs at 7 pm. Which do you want?

Customer: Oh it’s the one at noon

Alexa: The flight from Seattle, to Melbourne Florida, leaving today at 12 pm, is delayed. It doesn’t have a new departure time yet. Want to check another trip?
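The disambiguating question in the last example ("Is that Melbourne in Australia, or Florida?") can be generated from data rather than hard-coded. The Python below is a minimal sketch. The `KNOWN_CITIES` table and the function name `resolve_city` are hypothetical stand-ins for your skill's own city or flight data.

```python
# Hypothetical lookup table; a real skill would query its own data source.
KNOWN_CITIES = {
    "melbourne": ["Australia", "Florida"],
    "anchorage": ["Alaska"],
}


def resolve_city(city):
    """Return (speech, resolved_destination) for a city the customer named.

    resolved_destination is None when the skill still needs more information.
    """
    name = city.strip()
    regions = KNOWN_CITIES.get(name.lower(), [])
    if not regions:
        return f"I didn't find {name}. Where are you going?", None
    if len(regions) == 1:
        # Only one match: confirm implicitly and move on.
        return f"{name}, {regions[0]}. Got it.", f"{name}, {regions[0]}"
    # Several matches: ask a targeted question instead of repeating the prompt.
    options = ", ".join(regions[:-1]) + ", or " + regions[-1]
    return f"Is that {name} in {options}?", None


print(resolve_city("Melbourne"))
print(resolve_city("Anchorage"))
```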

You can learn more about handling errors in the Patterns & Components article.


How to make Alexa sound more natural

The information on this page covers the words that your skill will say, and the manner and style in which they will be said. Another essential part of natural conversations with Alexa is how the customer hears the speech itself. Human speakers effortlessly use the volume, intonation, emphasis, and prosody of their voice to communicate additional meaning and essential context for the listener. Learn how to control these elements and more in your skill’s speech with the Speech Synthesis Markup Language (SSML) Reference.


Listen to the following conversations before and after the use of SSML to change the way the skill delivers the sound of the speech.

 

Before:

 

Customer: Alexa, Start Seattle Super Trivia

Alexa: Welcome to Seattle Super Trivia. It’s OK, you’ve probably never heard of it. Think you know Seattle? Think again, transplant. I’m about to school you! Ready to start?

Customer: Yes

Alexa: Whoah, put that umbrella down. Here’s some tourist information. Each day I’ll have five fresh trivia questions for you. A perfect score earns you a sudden-death bonus round, where you can keep playing as long as you keep winning. You can ask me for one hint each day. Let’s try a practice question.

Alexa: When was the city of Seattle founded? Was it A 1861 B 1902 or C 1792?

Customer: Uh … A. 1861?

Alexa: That’s correct! The City of Seattle was founded in A: 1861. It was a Wednesday. Too bad that one didn’t count. It’s just for practice. Now on to the daily game.

After:

 

Customer: Alexa, Start Seattle Super Trivia

Alexa:

<speak>

<amazon:emotion name="excited" intensity="medium">Welcome to Seattle Super Trivia. </amazon:emotion>

<amazon:domain name="conversational">It’s OK, you’ve probably never heard of it. Think you know Seattle? Think again, transplant. </amazon:domain>

<amazon:emotion name="excited" intensity="medium">I’m about to school you! </amazon:emotion>

<amazon:domain name="conversational">Ready to start?</amazon:domain>

</speak>

Customer: Yes

Alexa:

<speak>

<amazon:emotion name="excited" intensity="medium">Whoah, put that umbrella down.</amazon:emotion>

<amazon:domain name="conversational">Here’s some tourist information. Each day I’ll have five fresh trivia questions for you. A perfect score earns you a sudden-death bonus round, where you can keep playing as long as you keep winning. You can ask me for one hint each day. Let’s try a practice question.

</amazon:domain>

</speak>

Alexa:

<speak>

When was the city of Seattle founded? Was it <break time="250ms"/> A. 1861; B. 1902; or C. 1792?

</speak>

Customer: Uh … A. 1861?

Alexa:

<speak>

<amazon:emotion name="excited" intensity="low"> That’s correct. </amazon:emotion>

<amazon:domain name="news"> The City of Seattle was founded in A: 1861. It was a Wednesday. </amazon:domain>

<amazon:emotion name="disappointed" intensity="high"> Too bad that one didn’t count. It’s just for practice. </amazon:emotion>

<amazon:emotion name="excited" intensity="low"> Now on to the daily game. </amazon:emotion>

</speak>
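If your skill assembles SSML in code, small helper functions can keep the tags balanced and escape characters such as `&` that would otherwise break the markup. The Python below is a sketch using only the standard library. The helper names `emotion`, `domain`, and `speak` are hypothetical, and the tag and attribute names are the ones used in the example above.

```python
from xml.sax.saxutils import escape, quoteattr


def emotion(text, name, intensity="medium"):
    """Wrap text in an amazon:emotion tag, escaping &, <, and > in the text."""
    return (f"<amazon:emotion name={quoteattr(name)} "
            f"intensity={quoteattr(intensity)}>{escape(text)}</amazon:emotion>")


def domain(text, name):
    """Wrap text in an amazon:domain tag, escaping &, <, and > in the text."""
    return f"<amazon:domain name={quoteattr(name)}>{escape(text)}</amazon:domain>"


def speak(*fragments):
    """Join already-tagged fragments inside a single speak element."""
    return "<speak>" + " ".join(fragments) + "</speak>"


print(speak(
    emotion("That's correct.", "excited", "low"),
    domain("The City of Seattle was founded in 1861.", "news"),
))
```

Building the response from helpers like these makes a missing opening or closing tag much less likely than hand-typing each response string.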


Design checklist for natural speech

Checklist for taking turns & using conversation cues:

▢  Ask a question only immediately before opening the microphone

▢  Use conversation cues that indicate progress on a multi-step task

▢  Use implicit, explicit, and contextual cues to indicate Alexa understood the customer

▢  Display progress cues on Alexa devices with a screen

 

Checklist for writing contextual dialog:

▢  Allow the customer to over-answer; don't make them repeat this information

▢  Allow the customer to under-answer; follow up for additional information as needed

▢  Hold context within the conversation; don't make the customer re-state or re-confirm details

▢  Hold some context across multiple sessions; demonstrate your skill remembers relevant details of previous conversations and uses them to expedite or enhance an experience

▢  Display additional context on Alexa devices with a screen

 

Checklist for using casual grammar & style:

▢  Speak in active voice

▢  Use contractions

▢  Avoid jargon and formal vocabulary

▢  Vary the responses customers will hear the most

▢  Don’t state the obvious

▢  Use the fewest words to convey the most meaning

 

Checklist for handling errors gracefully:

▢  Use plain language and simple prompts to reduce errors

▢  Explain to the customer what went wrong using the best available data

▢  Clarify the information needed and why it’s needed to continue

▢  Anticipate questions that won’t be supported in the skill and create an answer or redirect to another source

▢  Offer a contextually relevant response when the customer asks for help
