It seems that every time I sit down to write a new skill for Alexa, I think back to all of the skills I’ve built previously. I think about the mistakes I’ve made, the things I didn’t fully understand, and how much I’ve learned since then. When I write a new skill, I learn something new every single time. Today, I want to share some of the things I’ve learned about writing sample utterances for my intents.
For those of you who might not know what an utterance is, let me start there. When a user speaks to your skill, Alexa tries to determine what the user’s intention was, and maps that to a specific block of code in your skill. That block of code is called an “intent.” Each intent needs several sample utterances so that Alexa has enough clues and context to match what a user said with the appropriate intent in your code. Today I’m going to share best practices for thinking about and writing these utterances. For more information, watch this video from the Alexa Voice Design Guide on intents and utterances.
It may seem obvious, but writing down the kinds of conversations your users will have with your skill goes a long way in understanding how to craft your utterances. When you have to use real words, in real sentences, it changes how you think about a user speaking. I always recommend starting with one of our dialog worksheets (simple or advanced) to help you begin understanding your user interactions.
You don’t need to do this for every scenario in your skill, but writing down a few of your “happy path” interactions will go a long way in helping you understand what you need to create an engaging and conversational skill.
It's important to recognize that in-skill utterances are very different from one-shot utterances. This lesson is incredibly easy to forget when you’re creating your interaction model. Here’s an example:
In-skill
Alexa: “What can I help you with?”
User: “Tell me about speechcons.”
One-shot
User: “Alexa, ask Dev Tips about speechcons.”
When I’m building my interaction model, I often forget about the one-shot examples. It’s so easy and familiar to focus on variations of “tell me about speechcons” that I completely forget about how different one-shot utterances can be. An utterance list like this is very common:
“Tell me about speechcons”
“I want to learn about speechcons”
“What are speechcons”
“How do I use speechcons”
In order to represent the one-shot utterances, however, we should include some examples like these:
“About speechcons”
“How to use speechcons”
“What speechcons are”
When the examples above are added to the prefix “Alexa, ask Dev Tips…” they make perfect sense, but they are so easy to overlook when you’re building a skill. Make sure you’re thinking about one-shot utterances for your skill. Your users will certainly be using them.
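One way to catch missing one-shot utterances is to read each candidate aloud with the invocation prefix attached. The snippet below is a minimal sketch of that check. The helper name `one_shot_phrases` is hypothetical and is not part of any Alexa SDK; it only builds the full sentences so you can proofread them.

```python
# Hypothetical proofreading helper; not part of any Alexa SDK.
INVOCATION_NAME = "Dev Tips"  # invocation name from the example above

# The one-shot style utterances listed above.
ONE_SHOT_UTTERANCES = [
    "about speechcons",
    "how to use speechcons",
    "what speechcons are",
]


def one_shot_phrases(invocation_name, utterances, launch_word="ask"):
    """Return each utterance as the full sentence a user would say to Alexa."""
    return [
        f"Alexa, {launch_word} {invocation_name} {utterance}."
        for utterance in utterances
    ]


if __name__ == "__main__":
    for phrase in one_shot_phrases(INVOCATION_NAME, ONE_SHOT_UTTERANCES):
        print(phrase)
```

If a generated sentence sounds unnatural when read aloud, that utterance probably belongs only in the in-skill list.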
Another tip I like to offer to developers is to take common phrases and words that you find repeated in your utterances and create a custom slot for them. For example, I’m currently building a game for Alexa that presents you with three clues, and you have to determine what the clues have in common. It’s a simple model, but there’s a wide range of responses that a user can provide. Most of them follow the format “they all have X.” Here’s what my utterances look like for my AnswerIntent:
You can see that there are two common slots used in all of my utterances: thisthatthey and answer. Answer is the obvious one, because it is a custom slot that has all of the answers to my puzzle questions. This custom slot includes values like “wired,” “bases,” “launched,” and “teeth” because those are all correct answers to one of my questions. I can use data from entity resolution to help me determine if the user said one of my answers, or if they completely missed it.
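To illustrate the entity resolution check, here is a minimal sketch that inspects the `answer` slot from an incoming IntentRequest. It assumes the slot has already been parsed into a plain Python dict shaped like the request JSON; the function name `resolve_answer`, the sample payloads, and the placeholder authority string are all made up for illustration.

```python
def resolve_answer(slot):
    """Return the canonical slot value if entity resolution matched, else None."""
    resolutions = slot.get("resolutions", {})
    for authority in resolutions.get("resolutionsPerAuthority", []):
        # ER_SUCCESS_MATCH means the spoken text matched a value (or synonym)
        # defined in the custom slot type.
        if authority.get("status", {}).get("code") == "ER_SUCCESS_MATCH":
            values = authority.get("values", [])
            if values:
                return values[0]["value"]["name"]
    return None


# Illustrative payloads only; the authority string is a placeholder.
MATCHED_SLOT = {
    "name": "answer",
    "value": "tooth",
    "resolutions": {
        "resolutionsPerAuthority": [
            {
                "authority": "<authority-for-answer-slot>",
                "status": {"code": "ER_SUCCESS_MATCH"},
                "values": [{"value": {"name": "teeth", "id": "TEETH"}}],
            }
        ]
    },
}

MISSED_SLOT = {
    "name": "answer",
    "value": "bananas",
    "resolutions": {
        "resolutionsPerAuthority": [
            {
                "authority": "<authority-for-answer-slot>",
                "status": {"code": "ER_SUCCESS_NO_MATCH"},
            }
        ]
    },
}

if __name__ == "__main__":
    print(resolve_answer(MATCHED_SLOT))
    print(resolve_answer(MISSED_SLOT))
```

Returning `None` for a miss keeps the handler simple: the skill can compare the resolved name against the expected answer and treat anything else as a wrong guess.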
Thisthatthey, however, is simply a mechanism to keep my list of utterances manageable. It is also a custom slot, and it contains these values: “this,” “that,” “they,” “things,” “these.” Without the slot, every utterance that uses thisthatthey above would have to be written five times, once for each of those words. You can see how a list of twenty utterances can quickly become 100 or more. Using the slot keeps my list of utterances significantly shorter, which makes adding to and maintaining it in the future significantly easier.
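The arithmetic behind that claim is easy to check. The sketch below expands slot-based templates into the literal utterances they stand in for. Only the first template follows the “they all have X” format mentioned earlier; the other templates and the helper names `expand` and `expand_all` are hypothetical, not my actual utterance list.

```python
# Values of the thisthatthey custom slot, as listed above.
THISTHATTHEY = ["this", "that", "they", "things", "these"]

# Only the first template mirrors the format described in this post;
# the rest are hypothetical examples.
TEMPLATES = [
    "{thisthatthey} all have {answer}",
    "{thisthatthey} are all {answer}",
    "i think {thisthatthey} all have {answer}",
    "{answer}",  # no thisthatthey slot, so it does not multiply
]


def expand(template, slot_name, values):
    """Return the literal utterances one template replaces for a given slot."""
    placeholder = "{" + slot_name + "}"
    if placeholder not in template:
        return [template]
    return [template.replace(placeholder, value) for value in values]


def expand_all(templates, slot_name, values):
    """Expand every template and return the combined list of utterances."""
    expanded = []
    for template in templates:
        expanded.extend(expand(template, slot_name, values))
    return expanded


if __name__ == "__main__":
    literal = expand_all(TEMPLATES, "thisthatthey", THISTHATTHEY)
    print(f"{len(TEMPLATES)} templates replace {len(literal)} literal utterances")
```

With twenty templates that each use the slot, the same expansion yields 20 × 5 = 100 literal utterances, which is the growth described above.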
As you can see, there’s plenty to think about when you are writing sample utterances for your Alexa skill. One-shot utterances are obviously very important, but more than anything, taking the time and care to think deeply about how your user will interact with your skill will make every interaction that much better.
Do you have any tips or tricks you use for your utterances? I’d love to hear about them. You can reach me on Twitter @jeffblankenburg. I’d love to continue this conversation, so please reach out!
Every month, developers can earn money for eligible skills that drive some of the highest customer engagement. Developers can increase their level of skill engagement and potentially earn more by improving their skill, building more skills, and making their skills available in the US, UK and Germany. Learn more about our rewards program and start building today. Download our guide or watch our on-demand webinar for tips to build engaging skills.