To create a successful Alexa skill, it helps to understand how Spoken Language Understanding (SLU) lets Alexa interpret what users say. This blog gives you a brief overview of SLU, along with tips on how to create proper example phrases, build robust sample utterances, and avoid duplicating sample utterances or slot values within intents.
How SLU Works
SLU is the process by which Alexa recognizes and interprets a user interaction. This interaction begins with the wake word.
The wake word for an Echo device can be one of the following preset names: Alexa, Amazon, Echo, and Computer. To initiate interaction with an Echo device, a user must first say the wake word. From there, users can then tell Alexa what they would like to do next.
Here is an example of a user utterance that would begin a skill interaction:
“Alexa, ask Horoscope Reader my horoscope for Leo today.”
In this example, “Alexa” is the wake word, “ask” is the launch word, “Horoscope Reader” is the skill name, and “my horoscope for Leo today” is the request or utterance.
If the device does not understand the wake word, the user cannot complete their interaction.
From the wake word, the next step in the SLU process is Automatic Speech Recognition (ASR). ASR is when speech from the user is converted into text.
After the speech recognition stage, the top result is passed to Natural Language Understanding (NLU).
NLU is how Alexa interprets the user’s command. If Alexa does not understand the launch phrase or request, she might suggest something similar or come back with an error response like, “Sorry, I didn’t understand that.” For the best user experience, requests should follow predictable phrasing that matches your sample utterances.
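Once NLU resolves the request to an intent, your skill’s backend receives a structured IntentRequest. Here is a simplified sketch of that payload; the intent name “GetHoroscopeIntent” and the slot “Sign” are illustrative, not from a real skill:

```json
{
  "request": {
    "type": "IntentRequest",
    "intent": {
      "name": "GetHoroscopeIntent",
      "slots": {
        "Sign": { "name": "Sign", "value": "Leo" }
      }
    }
  }
}
```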
Finally, the skill’s text response invokes the TTS (text-to-speech) service, which converts text (or more specifically SSML, the Speech Synthesis Markup Language) into audio, i.e. Alexa’s voice. The service sends this audio to the device (e.g. an Echo Dot), which plays it through its speakers.
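For example, a skill’s response carries SSML in its outputSpeech object, which the TTS service renders as Alexa’s voice. A minimal sketch of such a response (the wording is illustrative):

```json
{
  "version": "1.0",
  "response": {
    "outputSpeech": {
      "type": "SSML",
      "ssml": "<speak>Here is your horoscope for <emphasis level=\"moderate\">Leo</emphasis> today.</speak>"
    }
  }
}
```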
How to Build a Successful Skill with SLU in Mind
Now that you have an understanding of how SLU works, here are some tips and resources to help you create proper example phrases, build robust sample utterances, and avoid duplicating sample utterances or slot values within intents, all of which help improve the user experience for your skill.
Example Phrases:
Example phrases are the phrases displayed with the skill in the skills store. They act as suggestions on how to interact with the skill and highlight its key functionalities.
The basic structure of how an example phrase works can be found in our documentation, Understanding How Users Invoke Custom Skills, and can be briefly explained with the following:
Wake word: This is “Alexa” by default on Alexa devices, but can be adjusted by customers based on their preferences.
Launch word: As specified in our documentation, this can be any of a number of starting phrases, including "open," "ask," "start," and more.
Invocation name: This is the invocation name you assigned to your skill under Skill Information within the skill-creation workflow.
Connecting word: These are words used to connect the launch word to the utterance and include "and," "to," "for," "when," and more. For a full list, please see Understanding How Users Invoke Custom Skills.
Utterance: These are required and should be modeled based on the sample utterances within your interaction model.
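In the skill manifest (skill.json), example phrases are declared per locale under publishingInformation. A sketch, assuming a hypothetical skill invoked as "horoscope reader":

```json
{
  "publishingInformation": {
    "locales": {
      "en-US": {
        "examplePhrases": [
          "Alexa, open Horoscope Reader",
          "Alexa, ask Horoscope Reader for my horoscope",
          "Alexa, ask Horoscope Reader for the Leo horoscope today"
        ]
      }
    }
  }
}
```

Note how each phrase combines the components above: wake word, launch word, invocation name, connecting word, and utterance.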
It is important to review and test example phrases so the user experience with the skill is positive. After you have created your example phrases, test each one to ensure the skill can be launched successfully.
Sample Utterances:
While building your skill, you’ll likely have created sample utterances. The usability of a skill is dependent on how well the sample utterances and custom slot values represent natural conversation sentence structure. To dive deeper, see our documentation Best Practices for Sample Utterances and Custom Slot Type Values. Some key points are listed below.
An example of this would be a user interacting with the Daily Horoscope skill. The developer needs to create as many sample utterances as they can think of so the interaction is smooth.
For example, users might phrase a request for a horoscope in many different ways. For each intent, include as many variations of the phrases as you expect users to speak. For the utterance "what is my horoscope", include variations such as "what's my horoscope", "tell me my horoscope", and "give me my horoscope".
It is better to provide too many samples than to provide too few, so test different phrases and add additional phrases as needed.
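In the interaction model JSON, these variations live in the intent's samples array. A sketch with illustrative names (the intent "GetHoroscopeIntent" and slot type "ZodiacSign" are assumptions for this example):

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "horoscope reader",
      "intents": [
        {
          "name": "GetHoroscopeIntent",
          "slots": [{ "name": "Sign", "type": "ZodiacSign" }],
          "samples": [
            "what is my horoscope",
            "what's my horoscope",
            "tell me my horoscope",
            "give me the horoscope for {Sign}"
          ]
        }
      ]
    }
  }
}
```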
Custom Slot Types:
When using custom slot types, it is important to ensure the skill covers any dialogue you expect from the user. As our documentation, Best Practices for Sample Utterances and Custom Slot Type Values, explains, slot values act as representative samples rather than a strict list of allowed values, so provide a broad, varied set of values.
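A custom slot type is defined under the interaction model's languageModel, alongside the intents, with the values (and optional synonyms) users are expected to say. An illustrative sketch, reusing the hypothetical "ZodiacSign" type:

```json
{
  "types": [
    {
      "name": "ZodiacSign",
      "values": [
        { "name": { "value": "Leo" } },
        { "name": { "value": "Virgo" } },
        { "name": { "value": "Aries", "synonyms": ["the ram"] } }
      ]
    }
  ]
}
```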
Test your utterances, check for conflicts, and revise
As you create your utterances and slot values, it’s possible to duplicate utterances that map to more than one intent in your model. To make sure everything is in order, you can use the developer console to find these utterance conflicts and make any appropriate updates. The utterance profiler and NLU evaluation tool are also great resources that you can use to test your utterances and measure the accuracy of your model.
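Alongside the console's conflict detection, you can catch exact duplicates yourself while editing the model offline. A minimal sketch (not an official ASK tool) that scans an interaction-model dictionary for identical samples mapped to more than one intent; the intent names are hypothetical:

```python
from collections import defaultdict


def find_duplicate_samples(language_model):
    """Return normalized samples that are mapped to more than one intent."""
    seen = defaultdict(set)  # normalized sample -> set of intent names
    for intent in language_model.get("intents", []):
        for sample in intent.get("samples", []):
            # Naive normalization: trims whitespace and lowercases everything,
            # including slot names in braces.
            seen[sample.strip().lower()].add(intent["name"])
    return {s: sorted(names) for s, names in seen.items() if len(names) > 1}


# Hypothetical model fragment in which two intents share an utterance pattern.
model = {
    "intents": [
        {"name": "GetHoroscopeIntent", "samples": ["tell me about {Sign}"]},
        {"name": "SignInfoIntent", "samples": ["Tell me about {Sign}"]},
    ]
}
print(find_duplicate_samples(model))
```

A check like this only finds literal duplicates; the console's conflict detection also accounts for slot values that make distinct samples overlap, so use both.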
Some examples of what these utterance conflicts may look like can be found in our documentation Find Utterance Conflicts in Your Model.
Note that conflicts between utterances that use the AMAZON.SearchQuery slot type are not included.
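For illustration, a conflict arises when two intents in the same model list the same sample, so NLU cannot tell which intent the user means (the intent names here are hypothetical):

```json
{
  "intents": [
    { "name": "GetHoroscopeIntent", "samples": ["give me a reading"] },
    { "name": "DailyCardIntent", "samples": ["give me a reading"] }
  ]
}
```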
Utterance conflicts are detected after you successfully complete a full model build. If any conflicts are found, the developer console displays an alert. On the Build page, navigate to Custom > Interaction Model > Utterance Conflicts. From there you can make any necessary updates.
SLU is an integral part of building a successful Alexa skill. By applying the guides and resources above, you can help users more easily navigate and interact with your skill.