You can improve your existing custom skill’s customer experience right now using a text-based large language model (LLM) of your choice. Use an LLM to generate a robust set of utterances to add to your intents, improving your skill’s accuracy.
In this article:
This article shows you how to use a large language model (LLM) to improve your existing custom skill’s accuracy by generating more utterances to add to the intents in your interaction model (IM), so customers can speak more naturally to your skill and be understood.
Learn how to create your Interaction Model for your skill.
When you create a new custom intent, you provide a name and a list of utterances that customers might say to hear the response you’ve written for that intent. You can start by just writing out the full phrases, and then identify the slots within the phrases later.
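As a point of reference, here is a minimal sketch of what an intent looks like inside an Alexa interaction model’s JSON. The intent name, slot name, slot type, and sample utterances below are hypothetical, illustrating a trivia skill’s answer intent:

```json
{
  "name": "AnswerIntent",
  "slots": [
    {
      "name": "answer",
      "type": "AnswerType"
    }
  ],
  "samples": [
    "{answer}",
    "I think it's {answer}",
    "the answer is {answer}",
    "is it {answer}"
  ]
}
```

Each entry in `samples` is one utterance the customer might say, with `{answer}` marking where the slot value appears in the phrase.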
Some of the most common sources of errors and customer frustration are Interaction Models that include too few utterances to match what customers actually say to an intent because …
These IM “misses” might cause Alexa to tell the customer she can’t fulfill their request or didn’t find what they were looking for when the customer knows the skill should be capable of it, or does, in fact, have that content. That’s a Trustbuster. These misses may also cause Alexa to default to fallback responses and errors that don’t make any sense, or even ruin the experience of a competitive game. How disappointing for a customer to answer a tough trivia question correctly, only to have Alexa tell them their answer is incorrect because the skill didn’t have “<what the customer said>” as an accepted response in the Interaction Model! For example:
Customer: Alexa, ask Seattle Super Trivia to play today’s game.
Alexa: Welcome back to Seattle Super Trivia. Get on your rain boots. Here’s the first question of the day. Seattle gets an average of how much rain per year? Is it A. About 20 inches, B. About 40 inches, or is it C. About 100 inches?
Customer: I think it’s 40.
Alexa: <fail sound fx> Ouch. Not quite. The answer was B. About 40 inches. Let’s see if the rain lets up on the next one.
Customer: Alexa, stop!
Our interaction model for this hypothetical skill didn’t support the customer answering “I think it’s 40.” Remember, customers speak naturally when answering a question, prefacing responses with phrases like “I think it’s…,” and they use partial phrases rather than the full phrases your skill (and maybe even your display) used, such as simply “40.”
An LLM can help with this problem. You can use it to help you generate additional utterances and better anticipate what customers might say to your skill. You can do this by using a general prompt, or with a prompt that asks for variations on some sample phrases you specify.
If we prompt an LLM with something like the following example, we can get a good starter set of utterances, or catch utterances we may have missed. You probably won’t want to use the whole list; instead, select the ones that sound most natural, and re-generate the list by requesting variations of those:
What are the different ways a person might state the correct answer to the following trivia question: “Seattle gets an average of how much rain per year? Is it A. About 20 inches, B. About 40 inches, or is it C. About 100 inches?” Give me 100 examples using 10 words or fewer each to choose B.
We can use a few high-quality phrases we generated above to further prompt for even more variation. For example:
Give me 100 variations someone might say on the following statements, using 10 words or fewer. [Paste your list from the above step]
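If you find yourself running this prompt repeatedly with different phrase lists, it can help to generate it programmatically. The sketch below is a hypothetical helper (the function name and defaults are our own, not part of any LLM’s API) that assembles the variation prompt from a list of seed phrases:

```python
def build_variation_prompt(phrases, n=100, max_words=10):
    """Assemble a prompt asking an LLM for short variations on seed phrases."""
    # List each seed phrase on its own line so the model treats them separately.
    listing = "\n".join(f"- {p}" for p in phrases)
    return (
        f"Give me {n} variations someone might say on the following "
        f"statements, using {max_words} words or fewer:\n{listing}"
    )

prompt = build_variation_prompt(["I think it's B", "about forty inches"])
```

You would then paste the resulting prompt into your LLM of choice (or send it through that LLM’s API) and review the output by hand.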
Learn more about Best Practices for Sample Utterances and Custom Slot Type Values so you can select the phrases that will yield further high-quality responses. You might also want to survey or interview multiple people about what they would say, so the examples included in your LLM prompt start from the most natural-sounding responses.
Keep in mind that when Alexa is listening to the customer, the mic will stay open for their response for eight seconds. You might want to specify in your prompt a maximum number of words, as we did above.
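Even with that instruction in the prompt, LLMs don’t always respect length limits, so it’s worth filtering the output yourself before adding anything to your model. A minimal sketch, using a simple whitespace word count as the length check:

```python
def within_word_limit(utterance, max_words=10):
    """Keep utterances short enough to say within the ~8-second mic window."""
    return len(utterance.split()) <= max_words

candidates = [
    "I think it's forty",
    "well gosh I would have to guess the answer is probably about forty inches",
]
short_enough = [u for u in candidates if within_word_limit(u)]
```

Here `short_enough` keeps only the first candidate; the second is 14 words and gets dropped.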
Double-check that your variations cover some of these natural speech patterns:
You may want to include some of these examples in your prompt. For example:
What are 100 different ways a person might say the following phrases, using 10 words or fewer: please tell me, tell me about, does it, do they, can you, give me, I want, do you have
We can select phrases from the outputs above and add them to the intents of our interaction model. Let’s say we’re creating a skill to help customers keep track of the health of their houseplants. Some things the customer might say that we’ll want to support, informed by the activity above, might include:
Keep in mind, however, that adding vastly more utterances (hundreds to thousands) to your custom skill model isn’t always better. You (a human) must still ensure there aren’t overlapping utterances across intents, and review every utterance the LLM generates before you add it. Finally, once you’ve added your utterances, one intent at a time, test your skill on a device to ensure the accuracy and expected outcomes don’t degrade.
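The overlap review is easy to automate before you paste anything into the developer console. The sketch below (a helper we wrote for illustration, with hypothetical intent names) flags any utterance that appears, case-insensitively, under more than one intent:

```python
from collections import defaultdict

def find_overlapping_utterances(intents):
    """Return utterances claimed by more than one intent, with the intent names."""
    seen = defaultdict(set)
    for intent_name, samples in intents.items():
        for sample in samples:
            # Normalize so "Water my plant" and "water my plant" collide.
            seen[sample.strip().lower()].add(intent_name)
    return {u: sorted(names) for u, names in seen.items() if len(names) > 1}

model = {
    "WaterPlantIntent": ["water my plant", "give it a drink"],
    "CheckPlantIntent": ["how is my plant", "water my plant"],
}
conflicts = find_overlapping_utterances(model)
```

In this example, `conflicts` reports that “water my plant” is claimed by both intents, so you would move or reword it before testing on a device.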