Your skill’s interaction model is more than the backbone of the customer experience; it is your skill’s entire skeletal system. It drives your skill’s ability to provide customers with relevant responses to their requests.
We regularly talk to developers about building a strong interaction model to optimize skill accuracy, helping customers get to the right intents. Here are a few best practices you can incorporate into your skills to improve customer engagement.
Write out sample utterances that will cover a variety of different things customers might say to access each of your intents, including different phrase patterns. Once you’ve decided what functions your skill will support and mapped out your skill’s intents, consider how a diverse set of customers will interact with your skill. If you’re not sure where to start, ask your friends and family how they might use the skill, and use those phrases as a starting point. For example, different customers might ask the following to a skill that provides market updates:
Additionally, consider how your skill functions. Are customers able to continue to interact with it for an extended period of time? If so, how is your skill prompting customers to respond? If a skill prompts a user for a location value with “what city are you in?”, people may respond with “Seattle,” “I’m in Boise,” or “I want weather for Boston,” so be sure to support all of these variations in your interaction model.
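As a rough illustration of the city prompt above, the three response shapes can be written as sample utterances on a single intent. This is only a sketch: the intent name `GetWeatherIntent` and the slot name `City` are hypothetical, and the built-in `AMAZON.US_CITY` slot type is an assumption about what would suit such a skill.

```python
import json

# Hypothetical intent answering the prompt "what city are you in?".
# The intent and slot names are illustrative, not taken from a real skill.
get_weather_intent = {
    "name": "GetWeatherIntent",
    "slots": [{"name": "City", "type": "AMAZON.US_CITY"}],
    "samples": [
        "{City}",                     # covers "Seattle"
        "i'm in {City}",              # covers "I'm in Boise"
        "i want weather for {City}",  # covers "I want weather for Boston"
    ],
}

# Print the intent in the JSON shape used by interaction model files.
print(json.dumps(get_weather_intent, indent=2))
```

Listing the bare slot alongside the longer carrier phrases is what lets a one-word answer and a full-sentence answer reach the same intent.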
If your skill supports one-shot requests in addition to in-skill requests, be sure to include variations that will work for the supported launch phrasing. For instance, for a weather skill, customers may say “ask <invocation name> what the weather is.” Some sample utterances may work for both one-shot and in-skill requests – for example, the sample utterance “I want the weather please” would make sense for customers to say by itself or as part of the request “tell <invocation name> that I want the weather please.”
When building your intents, try to balance the number of sample utterances across all intents where possible. If one intent has 25 sample utterances and another has only five, the intent with 25 may capture more than its fair share of user utterances. Additionally, if customers are likely to say phrases of varying lengths, include sample utterances with varying numbers of words. If all 25 sample utterances in an intent are a single word each, longer requests may not be routed to that intent properly. One way to add this variety is to include polite variations of your skill’s sample utterances with words like “please” and “thank you.” If you add polite forms, include them consistently across all or most of your skill’s intents, not just a single intent, so that customers’ requests are routed consistently.
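A quick script can make this kind of imbalance visible before you test on a device. The sketch below assumes your intents are available as a plain mapping from intent name to sample utterances; the function names and the `max_ratio` threshold of 3 are arbitrary choices for illustration, not Alexa guidance.

```python
def overweighted_intents(intents, max_ratio=3.0):
    """Return intents whose sample count exceeds max_ratio times the smallest count."""
    counts = {name: len(samples) for name, samples in intents.items() if samples}
    if not counts:
        return {}
    smallest = min(counts.values())
    return {name: n for name, n in counts.items() if n > max_ratio * smallest}


def word_counts(samples):
    """Return the distinct utterance lengths, in words, present in an intent."""
    return sorted({len(sample.split()) for sample in samples})


# Hypothetical data: one intent with 25 samples, another with 5.
example = {
    "GetFactIntent": [f"fact number {i}" for i in range(25)],
    "GetHelpIntent": [f"help topic {i}" for i in range(5)],
}
print(overweighted_intents(example))
print(word_counts(["fact", "tell me a fact", "give me a fox fact please"]))
```

An intent that reports a single value from `word_counts` is a candidate for adding shorter or longer variations.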
In line with the above, a fox fact skill could have sample utterances like:
Include matching sample utterances for anything you tell customers to say in your skill’s prompts, the Alexa app, and any skill marketing materials. These example phrases are what customers see or hear when searching for and interacting with your skill, so it’s important that they deliver the proper response to customers. Including a matching sample utterance for all of these phrases, as well as common variations customers might say, will ensure that these phrases will work consistently. For example, if your skill’s help prompt says “You can say, ‘how many days until my birthday?’”, make sure “how many days until my birthday” is covered in your sample utterances or slot values so that it will work consistently.
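One way to keep prompts and sample utterances in sync is to check advertised phrases against the model automatically. The following is a minimal sketch with hypothetical helper names; it only detects exact matches against literal sample utterances, so phrases that are covered through slot values would still need a manual look.

```python
import string

# Translation table that deletes ASCII punctuation.
_PUNCTUATION = str.maketrans("", "", string.punctuation)


def normalize(phrase):
    """Lowercase, drop punctuation, and collapse whitespace for comparison."""
    return " ".join(phrase.lower().translate(_PUNCTUATION).split())


def uncovered_phrases(advertised, samples):
    """Return advertised phrases that have no matching literal sample utterance."""
    covered = {normalize(sample) for sample in samples}
    return [phrase for phrase in advertised if normalize(phrase) not in covered]


# Phrases the skill tells customers to say (help prompt, app, marketing).
advertised = ["How many days until my birthday?", "When is my birthday?"]
samples = ["how many days until my birthday"]
print(uncovered_phrases(advertised, samples))
```

Any phrase the function returns is one that customers are being told to say without an explicit sample utterance behind it.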
When testing out utterances on either your device or in the Alexa Developer Portal, it’s possible that many of these phrases are being routed properly already. But changes to Alexa’s language models later on may cause these utterances to stop working, unless they are explicitly covered in the skill’s sample utterances.
Including the same, or very similar, sample utterances in multiple intents can cause confusion regarding where to send that request, leading to inconsistent user experiences and frustrated customers. Intent confusion can be caused by something seemingly minor, like extending AMAZON.CancelIntent with the sample utterance “stop” when you’re also using AMAZON.StopIntent, which also includes this sample utterance.
However, it can also be caused by the same slot values appearing in multiple slots, and those slots being used in the same way – such as including “one” in an Answer and a PlayerNumber slot in a quiz skill. To prevent this issue with slots, ensure that they are being used with unique carrier phrases, or the words around the slot. For example, “{Answer}” and “{PlayerNumber} players” in different intents won’t cause this problem, but “{Answer}” and “{PlayerNumber}” appearing by themselves in separate intents would.
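Both kinds of overlap described above can be surfaced by comparing carrier phrases across intents. The sketch below is a heuristic under stated assumptions: intents are a plain mapping of hypothetical names to sample utterances, and every slot is replaced by a generic placeholder, so a reported match is only a real conflict if the slots involved share values.

```python
import re
from collections import defaultdict

# Matches a slot reference such as {Answer} or {PlayerNumber}.
SLOT = re.compile(r"\{[^{}]+\}")


def shared_carrier_phrases(intents):
    """Map each carrier phrase used by more than one intent to those intents."""
    seen = defaultdict(set)
    for intent_name, samples in intents.items():
        for sample in samples:
            # Replace every slot with "{}" so only the surrounding words matter.
            carrier = SLOT.sub("{}", sample.lower().strip())
            seen[carrier].add(intent_name)
    return {phrase: sorted(names) for phrase, names in seen.items() if len(names) > 1}


# Hypothetical quiz skill: both slots appear with no carrier phrase.
risky = {"AnswerIntent": ["{Answer}"], "PlayerCountIntent": ["{PlayerNumber}"]}
# Same skill with a distinguishing carrier phrase on one slot.
safer = {"AnswerIntent": ["{Answer}"], "PlayerCountIntent": ["{PlayerNumber} players"]}

print(shared_carrier_phrases(risky))
print(shared_carrier_phrases(safer))
```

The same check also catches literal duplicates, such as “stop” appearing under both a cancel intent and a stop intent.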
Finally, slot values that appear outside of slots in sample utterances can also create intent confusion. If a weather skill includes the sample utterance “what will the weather be tomorrow” in a WeatherTomorrowIntent, but also has the sample utterance “what will the weather be {Date}” (where {Date} uses the AMAZON.DATE slot type) in a WeatherDateIntent, both sample utterances can match the same request, because “tomorrow” is a valid {Date} value. When a slot already covers a word along with other values, it is best practice to remove the version without the slot, since it covers fewer variations: in this case, “what will the weather be tomorrow.”
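The weather example can be checked mechanically by expanding slotted utterances with known values and looking for a literal twin in another intent. This is a sketch only: the function name is hypothetical, it handles utterances with a single slot, and the short list of {Date} values is an assumed stand-in, since a built-in type like AMAZON.DATE accepts far more values than can be enumerated here.

```python
import re

# Captures the slot name inside braces, e.g. "Date" from "{Date}".
SLOT = re.compile(r"\{([^{}]+)\}")


def literal_slot_collisions(intents, slot_values):
    """Find literal utterances that equal a slotted utterance in another intent."""
    literals = {}  # literal utterance -> intent that owns it
    for name, samples in intents.items():
        for sample in samples:
            if not SLOT.search(sample):
                literals[sample.lower()] = name

    collisions = []
    for name, samples in intents.items():
        for sample in samples:
            slots = SLOT.findall(sample)
            if len(slots) != 1:
                continue  # this sketch only expands single-slot utterances
            for value in slot_values.get(slots[0], []):
                expanded = sample.replace("{" + slots[0] + "}", value).lower()
                owner = literals.get(expanded)
                if owner is not None and owner != name:
                    collisions.append((expanded, owner, name))
    return collisions


intents = {
    "WeatherTomorrowIntent": ["what will the weather be tomorrow"],
    "WeatherDateIntent": ["what will the weather be {Date}"],
}
# Assumed sample of values the {Date} slot can take.
slot_values = {"Date": ["today", "tomorrow"]}
print(literal_slot_collisions(intents, slot_values))
```

Each tuple names the colliding phrase, the intent holding the literal version, and the intent holding the slotted version, which is the one to keep.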
Ensuring that slot values and sample utterances are properly tokenized is essential to a strong interaction model. This includes using punctuation and special symbols only when necessary, and writing values in spoken form where appropriate. One of the most important questions to ask yourself during this step is, “How would customers say this out loud?”
Any abbreviations, such as “dr.” or “jr.” should be written out in their spoken form – that is, “doctor” and “junior.” If customers are likely to say the abbreviation as well as the full word (such as “Pepsi Co.” and “Pepsi Company”), consider supporting both variations in your slot values.
In sample utterances, acronyms should be written only in lowercase letters separated by periods and spaces (“a. s. a. p.”).
Similar to the above, if customers are likely to say these acronyms as words, too, consider supporting both variations (“asap” and “a. s. a. p.”). In slot values, acronyms can also be written in all capital letters, not separated by periods and spaces (“ASAP”).
Finally, punctuation and special characters should be used sparingly, if at all. Do not include question marks, exclamation points, ampersands, at symbols, or other special characters in slot values, even when referencing entities that include these things (for example, the singer “P!nk” should be written out as “Pink”).
The exception here is for non-English locales, such as Spanish, Italian, and French: accented characters should be included wherever they are needed. For example, “adiós” instead of “adios.” Additionally, German skills should utilize umlauts (for example, “büro” instead of “buero”) and the sharp S (for example, “fußball” instead of “fussball”) wherever necessary. German skills should also use compounded words where appropriate.
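Several of the tokenization rules above are simple enough to lint for. The sketch below is an assumption-laden starting point rather than a complete validator: the function name and the three rules are illustrative, it targets sample utterances (where uppercase acronyms are not allowed), and it leaves accented characters, umlauts, and the sharp S alone.

```python
import re

SLOT = re.compile(r"\{[^{}]+\}")              # slot references such as {Date}
SPECIAL = re.compile(r"[?!&@]")               # characters the guidance says to avoid
ABBREVIATION = re.compile(r"\b[A-Za-z]{2,}\.")  # "dr." or "jr.", but not "a." in "a. s. a. p."
UPPER_ACRONYM = re.compile(r"\b[A-Z]{2,}\b")  # "ASAP" written in capitals


def lint_utterance(utterance):
    """Return a list of tokenization problems found in one sample utterance."""
    text = SLOT.sub(" ", utterance)  # ignore slot names when checking
    problems = []
    if SPECIAL.search(text):
        problems.append("special character")
    if ABBREVIATION.search(text):
        problems.append("abbreviation with period")
    if UPPER_ACRONYM.search(text):
        problems.append("uppercase acronym")
    return problems


for candidate in ["call dr. smith", "play P!nk", "do it ASAP", "do it a. s. a. p.", "adiós"]:
    print(candidate, lint_utterance(candidate))
```

A separate, looser rule set would be needed for slot values, where all-capital acronyms such as “ASAP” are acceptable.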
Any sample utterances that are meant to be said as part of a one-shot request should not begin with a connecting word – these connecting words are automatically supported by the system and do not get sent to your skill. If a user says, “ask my dice roller to roll the dice,” the request “roll the dice” is sent to the skill, so that should be the sample utterance included to support this interaction. Including the connecting word in cases like this will reduce system confidence and in the most severe cases may lead these utterances to be misrouted.
For an interaction like the one above, include only the sample utterances that do not begin with a connecting word, such as “roll the dice.” Variants that begin with a connecting word, such as “to roll the dice,” should be left out of the interaction model.
The exception is an utterance that customers are likely to say inside the skill and that starts with a connecting word; that utterance should be included in your interaction model. For example, if your skill asks “Where are you travelling to?”, customers might respond “to San Francisco,” in which case the connecting word is not automatically supported.
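Sample utterances that begin with a connecting word can be flagged for review with a few lines of code. In this sketch the function name is hypothetical and the word list contains only the two connecting words that appear in the examples in this post (“to” and “that”); it is an assumption that you would extend it to match the launch phrasing your skill supports.

```python
# Connecting words taken from the examples in this post; extend as needed.
CONNECTING_WORDS = {"to", "that"}


def flag_connecting_word_samples(samples):
    """Return sample utterances whose first word is a connecting word."""
    flagged = []
    for sample in samples:
        words = sample.lower().split()
        if words and words[0] in CONNECTING_WORDS:
            flagged.append(sample)
    return flagged


samples = ["roll the dice", "to roll the dice", "to {City}"]
print(flagged := flag_connecting_word_samples(samples))
```

The output is a review list rather than a delete list: an entry like “to {City}” may be a legitimate in-skill response to a question such as “Where are you travelling to?” and should then stay in the model.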
Applying these practices to your skill’s interaction model will improve your skill’s accuracy, lead to better skill experiences for your customers, and make your skill more engaging. Once you’ve created or updated your skill’s interaction model, make sure to resubmit the skill for certification so that these changes will go live in your skill.