Accurately understanding your customer’s utterances is extremely important, especially when they contain slots essential to your skill. We’re always updating and adding new features to the Alexa Skills Kit (ASK) to help you create more engaging voice experiences.
We added AMAZON.LITERAL to an early version of the Alexa Skills Kit to help you capture input from your customers into slots. Enhancements to the Alexa Skills Kit since the early version provide more accurate ways to capture slots than relying on AMAZON.LITERAL.
As a result, AMAZON.LITERAL is being retired. Starting October 22, 2018, new and updated skills will no longer be able to use AMAZON.LITERAL. If you have published a skill that uses AMAZON.LITERAL, customers will still be able to use it. Any updates you make to your skill after October 22 will require you to replace AMAZON.LITERAL.
Here are five techniques that you can use to replace AMAZON.LITERAL and improve skill accuracy.
The Alexa Skills Kit features a large collection of built-in slots. The roster is constantly expanding and existing slot types are regularly augmented with new values. However, there may be cases where a built-in slot doesn't include a value that you need.
Rather than sacrificing accuracy by using AMAZON.LITERAL, extend an existing slot type with the values that you need. This saves time because you don't need to create a custom slot type from scratch. You keep the speech recognition benefits of a predefined slot type while defining your own custom values and synonyms using entity resolution.
When we built a sample skill called Pet Match, we anticipated that our customers would ask for mythical creatures like dragons and unicorns. Rather than create a custom slot type, we extended the AMAZON.Animal slot type with our own custom value called mythical_creatures and assigned the AMAZON.Animal slot type to our pet slot. Then we used entity resolution to map dragon, unicorn, and chimera to mythical_creatures. By extending the AMAZON.Animal slot type, we’ve future-proofed our skill for when we want to recommend other animals such as cats, birds, and fish.
With entity resolution, the skill is provided both the synonym and the resolved value. Let's take a look at the JSON when the user says, "I'd like a dragon."
"pet": {
  "name": "pet",
  "value": "dragon",
  "resolutions": {
    "resolutionsPerAuthority": [
      {
        "authority": "amzn1.er-authority.echo-sdk.amzn1.ask.skill....",
        "status": {
          "code": "ER_SUCCESS_MATCH"
        },
        "values": [
          {
            "value": {
              "name": "mythical_creatures",
              "id": "ebd87e55a93308fd31a7d42d54b2e0cd"
            }
          }
        ]
      }
    ]
  },
  "confirmationStatus": "NONE"
}
Since our custom mapping is defined in the voice user interface and sent to our skill code, we don't need to keep a database of mythical creatures. We can easily check if the resolved value is mythical_creatures. When building our response, we can use the value and respond by saying, "I'm sorry, but I'm not qualified to match you with dragons."
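The check described above can be sketched in a few lines of Python. This is only an illustration that works on a plain dictionary shaped like the JSON shown earlier; the helper names `resolved_slot_value` and `build_pet_response` are hypothetical, and the plural is formed by naively appending "s" to the spoken value.

```python
def resolved_slot_value(slot):
    """Return (spoken value, resolved name) for a slot dict shaped like the JSON above.

    The resolved name is None when no authority reports ER_SUCCESS_MATCH.
    """
    spoken = slot.get("value")
    authorities = slot.get("resolutions", {}).get("resolutionsPerAuthority", [])
    for authority in authorities:
        if authority.get("status", {}).get("code") == "ER_SUCCESS_MATCH":
            values = authority.get("values", [])
            if values:
                return spoken, values[0]["value"]["name"]
    return spoken, None


def build_pet_response(slot):
    """Return an apology for mythical creatures, or None for any other pet."""
    spoken, resolved = resolved_slot_value(slot)
    if resolved == "mythical_creatures":
        # Naive pluralization, good enough for "dragon" and "unicorn".
        return f"I'm sorry, but I'm not qualified to match you with {spoken}s."
    return None
```

Because the skill only compares against the resolved name, adding another mythical creature is a change to the interaction model, not to this code.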
While there are many built-in slot types, there are cases where you'll need to define your own custom slot type(s). Defining custom slot types allows you to capture values unique to your use case. Using entity resolution, you can add even more flexibility to your skill to handle the variations of speech without sacrificing accuracy.
The Pet Match sample skill recommends a dog based on three required custom slots: temperament, size, and energy. The slot types have been defined with the following values:
SizeType | EnergyType | TemperamentType |
---|---|---|
tiny | high | guard |
small | medium | family |
medium | low | |
large | | |
Defining our slot types allows us to collect spoken values and pass them to our skill. If we need a new value, we can simply add it to our slot type definition. It is also important to remember that slot types aren’t enumerations. Values that aren’t in your slot type’s list of values can still fill the slot. For example, if your skill asks, “What size of dog would you like?” and the customer says “gigantic,” the slot will be filled even though “gigantic” isn’t one of your defined values.
Entity resolution adds even more flexibility to our interactions without sacrificing accuracy.
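Because a value outside the list can still fill the slot, skill code usually needs its own check. A minimal sketch follows, assuming the spoken value has already been pulled out of the request; the function name `check_size` and the reprompt wording are hypothetical.

```python
# Values copied from the SizeType column of the table above.
SIZE_VALUES = {"tiny", "small", "medium", "large"}


def check_size(spoken_value):
    """Return (accepted size or None, reprompt text or None)."""
    normalized = (spoken_value or "").strip().lower()
    if normalized in SIZE_VALUES:
        return normalized, None
    # The slot was filled with something we did not define, such as "gigantic".
    reprompt = "I can match tiny, small, medium, or large dogs. Which size would you like?"
    return None, reprompt
```

With this in place, "gigantic" produces a reprompt instead of flowing into the recommendation logic.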
We defined the following phrases as synonyms to tiny:
cheap to feed
teacup
pocket
yippy
carry in my purse
put in my pocket
itty bitty
Mapping these phrases to tiny enables the customer to say, "I want a dog that's cheap to feed" and fill our size slot. In this case, the value is "cheap to feed" and our resolved value will be tiny.
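For reference, here is roughly how the SizeType definition with its synonyms might look, written as a Python dictionary that mirrors the interaction model JSON. The layout reflects our understanding of that schema and should be checked against the ASK documentation. The `synonym_lookup` helper is hypothetical and only useful for local unit tests; Alexa performs the real resolution.

```python
import json

SIZE_TYPE = {
    "name": "SizeType",
    "values": [
        {
            "name": {
                "value": "tiny",
                "synonyms": [
                    "cheap to feed",
                    "teacup",
                    "pocket",
                    "yippy",
                    "carry in my purse",
                    "put in my pocket",
                    "itty bitty",
                ],
            }
        },
        {"name": {"value": "small"}},
        {"name": {"value": "medium"}},
        {"name": {"value": "large"}},
    ],
}


def synonym_lookup(slot_type):
    """Build a phrase -> canonical value map from a slot type definition."""
    lookup = {}
    for entry in slot_type["values"]:
        canonical = entry["name"]["value"]
        lookup[canonical] = canonical
        for synonym in entry["name"].get("synonyms", []):
            lookup[synonym] = canonical
    return lookup


if __name__ == "__main__":
    # Print the definition as JSON for pasting into an interaction model.
    print(json.dumps(SIZE_TYPE, indent=2))
```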
Read our Alexa skill teardown on understanding entity resolution with Pet Match. You’ll learn how to create custom slots and define synonyms with entity resolution to accurately handle variations of speech.
There are times when the input the skill needs to capture is indeterminate. For example, let's say you're building a skill that looks up song titles by lyrics. The customer says something like, "What's the song that goes, 'Never gonna give you up'?" and the skill responds, "I love that one! That's 'Never gonna give you up' by Rick Astley."
Since there are many songs, each containing an indeterminate number of lyrics, there's no practical way to define them all in your voice user interface. Rather than AMAZON.LITERAL, you should use the AMAZON.SearchQuery slot.
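As a rough sketch, an intent for the lyrics example might be defined and read as shown here. The intent name `FindSongIntent`, the slot name `lyrics`, and the sample phrasing are all hypothetical. The samples wrap the slot in a carrier phrase because, as we understand the ASK documentation, an utterance using AMAZON.SearchQuery needs words in addition to the slot.

```python
# Hypothetical intent definition, mirroring the interaction model JSON layout.
FIND_SONG_INTENT = {
    "name": "FindSongIntent",
    "slots": [{"name": "lyrics", "type": "AMAZON.SearchQuery"}],
    "samples": [
        "what's the song that goes {lyrics}",
        "find the song that goes {lyrics}",
    ],
}


def lyrics_from_request(request):
    """Return the free-form text captured by the lyrics slot, or None."""
    slots = request.get("intent", {}).get("slots", {})
    return slots.get("lyrics", {}).get("value")
```

The captured text can then be passed to whatever lyrics search the skill uses.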
Take a look at how to enhance speech recognition of your Alexa skill with the AMAZON.SearchQuery slot type and add it to your skill.
By their nature, conversations are dynamic. No two people speak exactly alike. When designing a skill, we try our best to build an interaction model that serves our customers’ needs. There are times, however, when our customer utters something to our skill that is either completely irrelevant or potentially matches more than one of our intents.
To solve this problem, some have resorted to using AMAZON.LITERAL. However, a far more accurate solution exists: AMAZON.FallbackIntent, which was created to solve this very problem.
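On the code side, handling the fallback intent can be as simple as one more entry in the skill's routing. The sketch below uses plain dictionaries instead of an SDK; the intent name `PetMatchIntent`, the handler names, and the response wording are hypothetical.

```python
def handle_pet_match(request):
    return "Let's find you a dog."


def handle_fallback(request):
    # Reached when the utterance does not match any of the skill's other intents.
    return "Sorry, I can't help with that. You can ask me to recommend a dog."


HANDLERS = {
    "PetMatchIntent": handle_pet_match,
    "AMAZON.FallbackIntent": handle_fallback,
}


def route(request):
    """Dispatch an intent request to its handler by intent name."""
    name = request.get("intent", {}).get("name")
    # Unknown intent names are also sent to the fallback handler.
    handler = HANDLERS.get(name, handle_fallback)
    return handler(request)
```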
Learn how to add FallbackIntent handling to your Alexa skill, and replace your custom solution.
There may be times when you want to capture, word for word, everything that was said. You can craft your utterances in a way that allows you to do so.
For example, when the user interacts with Pet Match, they can say things like:
I want a large dog.
I want a small dog for my family.
I want a guard dog.
I don't want a tiny dog.
To capture the values that we need to provide a recommendation, we defined four slots: pet, size, temperament and energy. Swapping the slot values with our slots, our utterances will look something like:
I want a {size} {pet}
I want a {size} {pet} for my {temperament}
I want a {temperament} {pet}
I don't want a {size} {pet}
Now let's consider what would happen if the user said, "I don't want a tiny dog." We'd be able to capture "tiny" and "dog," but we wouldn't be able to tell whether they do or don't want a "tiny" dog.
There's also a chance that a family could be using the skill and they may say "we want" or "we don't like." To capture these variants, create a few new custom slot types. Let's take a look at what that would look like:
{subject} {verb} {article} {size} {pet}
{subject} {verb} {article} {size} {pet} for my {temperament}
{subject} {verb} {article} {temperament} {pet}
{subject} {verb} {adverb} {verb2} {article} {size} {pet}
Now we can capture the entire sentence within our slots. Then, in our skill code, we can inspect the captured verb and adverb slots to determine whether the customer wants or doesn't want a tiny dog.
With these five essential techniques to capture slots, you should be well equipped to remove AMAZON.LITERAL and improve skill accuracy. Your customers will thank you.
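That logic could start out as simple as the sketch below. It assumes the slot values have already been collected into a dictionary keyed by the slot names used above, and that a negation shows up in the verb or adverb slot as "not", "never", or a contraction ending in "n't". The function names are hypothetical.

```python
def is_negated(slots):
    """Return True if any verb or adverb slot carries a negation."""
    for key in ("verb", "adverb", "verb2"):
        # Normalize curly apostrophes so "don’t" and "don't" are treated alike.
        value = (slots.get(key) or "").lower().replace("\u2019", "'")
        for word in value.split():
            if word in ("not", "never") or word.endswith("n't"):
                return True
    return False


def summarize_preference(slots):
    """Reduce the captured sentence to what the recommendation logic needs."""
    preference = "avoid" if is_negated(slots) else "want"
    return {"preference": preference, "size": slots.get("size"), "pet": slots.get("pet")}
```

For "I do not want a tiny dog", the summary marks tiny dogs as something to avoid, while "we want a large dog" marks large dogs as wanted.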