In a previous post, we detailed how a skill called Pet Match uses dialog management to handle a multi-turn sequence to collect slots from the user. Here we explain how Pet Match uses entity resolution to strengthen the interaction model to understand more complicated responses from the user. We recently covered both of these concepts in a webinar on advanced voice design techniques, during which we shared some best practices for applying advanced features to enable customers to engage in multi-turn conversations with Alexa skills. Check out the webinar recording and download the source code to dive deep on dialog management and entity resolution.
Today, we'll focus on Entity Resolution, which enables you to add synonyms to your slot values and validate that a user actually said one of them. The Alexa service then handles resolving the synonyms to your slot values. This simplifies your code, since you don't have to write anything to match the synonym the user said to your slot value.
Pet Match finds the perfect pet for the user by asking a series of questions designed to fill the size, temperament, and energy slots. Pet Match only supports dogs, but the same principles can be applied to match the user with cats, birds, turtles, books, movies, video games, or whatever you'd like. Once the three required slots are collected, the values are passed to an API through an HTTP GET request. The API returns the match in a JSON payload, which the skill unpacks and converts into speech output.
Pet Match's size slot has 4 values: tiny, small, medium, and large. But what happens if the user says huge? We're going to miss their size preference. We could add huge to our slot values, but our Pet Match API only supports the original 4 values. If we pass the API huge, it won't find a match. We could update the Pet Match API to support huge, but that won't scale well. And what happens if we don't own the API and can't modify it?
Furthermore, APIs are not designed for personal conversation. They are designed for communication between computer systems. For example, an API that provides Major League Baseball box scores may require a three-letter code instead of the full team name. It would be unnatural to force the user to say the three-letter code. Instead, we could map the team name as a synonym to the three-letter code, so the user can say the team name while our backend uses the three-letter code to make the API call. For Pet Match, we can map huge to large. That way we can still pass large to the Pet Match API while still capturing the synonym.
While it's possible to create our own synonym maps in our back-end code, the Alexa Skills Kit (ASK) provides everything we need to resolve synonyms with Entity Resolution. Take a look at Pet Match's Interaction Model and look for sizeType.
{
"name": "sizeType",
"values": [
{
"id": null,
"name": {
"value": "large",
"synonyms": [
"huge",
"truck",
"gigantic",
"eat me out of house",
"scary big",
"ginormous",
"ride",
"waist height"
]
}
},
...
}
As you can see, each value has a list of synonyms. The synonym huge has been mapped to the large slot value, so if the user says huge, the value will resolve to large. Entity Resolution is not limited to single-word phrases. You're able to combine several words to make more complex synonyms such as "eat me out of house" and "scary big." This allows the user to speak with your skill in a more natural way. If the user says, "I want a dog that will eat me out of house," the skill will resolve the value to large and pass it to the Pet Match API. Let's take a look at the JSON that is sent to our backend when we have a match. In this case the user said, "eat me out of house."
"size": {
"name": "size",
"value": "eat me out of house",
"resolutions": {
"resolutionsPerAuthority": [
{
"authority": "amzn1.er-authority.echo-sdk.[skill-id].sizeType",
"status": {
"code": "ER_SUCCESS_MATCH"
},
"values": [
{
"value": {
"name": "large",
"id": "afacdb0a401ccdf6b48551bbc00e8a74"
}
}
]
}
]
},
...
The value "eat me out of house" is the synonym, and the value or values it resolved to are contained in the resolutions object. Upon a match we get an ER_SUCCESS_MATCH status code. To access the resolved value programmatically, you can read this.event.request.intent.slots.size.resolutions.resolutionsPerAuthority[0].values[0].value.name.
The object is pretty complex and Pet Match really only needs the synonym and the resolved value. To simplify the object, use the getSlotValues function. It will return a simplified object of the form:
{
"SlotName": {
"synonym": '',
"resolved": '',
"isValidated": false
},
...
}
SlotName is the name of the slot, synonym is what the user said, resolved is the value that synonym resolved to, and isValidated is true when the status is ER_SUCCESS_MATCH. For example, if the user filled the energy slot with "play fetch with," the resulting object would look like:
{
...
"energy": {
"synonym": "play fetch with",
"resolved": "high",
"isValidated": true
},
...
}
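To make the shape of that simplified object concrete, here is a minimal sketch of how a getSlotValues-style helper could be written. It is an assumption based on the description above, not a copy of the function in the Pet Match source, and the fallback to the raw value when nothing resolved is a choice made for this sketch.

```javascript
// Hypothetical sketch: the real getSlotValues in the Pet Match source may differ.
function getSlotValues(filledSlots) {
  const slotValues = {};
  Object.keys(filledSlots).forEach(function (key) {
    const slot = filledSlots[key];
    // Only the first authority is inspected, as in the rest of this post.
    const authority = slot.resolutions &&
      slot.resolutions.resolutionsPerAuthority &&
      slot.resolutions.resolutionsPerAuthority[0];
    const matched = Boolean(authority && authority.status &&
      authority.status.code === 'ER_SUCCESS_MATCH');
    slotValues[slot.name] = {
      synonym: slot.value,
      // Assumption: fall back to what the user said when nothing resolved.
      resolved: matched ? authority.values[0].value.name : slot.value,
      isValidated: matched
    };
  });
  return slotValues;
}

// Example input shaped like the "size" slot shown earlier.
const exampleSlots = {
  size: {
    name: 'size',
    value: 'eat me out of house',
    resolutions: {
      resolutionsPerAuthority: [{
        status: { code: 'ER_SUCCESS_MATCH' },
        values: [{ value: { name: 'large', id: 'afacdb0a401ccdf6b48551bbc00e8a74' } }]
      }]
    }
  }
};
console.log(getSlotValues(exampleSlots).size);
```

Keeping the raw synonym next to the resolved value means the skill can echo the user's own words in speech while still sending the canonical value to the API.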
In code we could access the synonym and resolved values by simply doing:
let isTestingWithSimulator = false;
let filledSlots = delegateSlotCollection.call(this, isTestingWithSimulator);
let slotValues = getSlotValues(filledSlots);
console.log('Energy - Synonym: "', slotValues.energy.synonym, '" Resolved: "', slotValues.energy.resolved,'"');
For Pet Match, we resolve each slot down to its resolved value and send those values to the Pet Match API through an HTTP GET request to perform the match. Leveraging Entity Resolution allows us to easily map the synonym to the value the API expects and provides a more natural interaction with the skill, since the user can say things like "plays tug of war" and it will resolve to medium.
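As a rough illustration of that step, the sketch below builds a GET URL from the resolved values only. The host, path, parameter names, and the temperament value are placeholders invented for this example; the post does not show the real Pet Match endpoint.

```javascript
// Hypothetical endpoint: a placeholder, not the real Pet Match API.
const PET_MATCH_BASE = 'https://example.com/pets/match';

// Build the request URL from the resolved values, never the raw synonyms.
function buildPetMatchUrl(slotValues) {
  const params = new URLSearchParams({
    size: slotValues.size.resolved,
    temperament: slotValues.temperament.resolved,
    energy: slotValues.energy.resolved
  });
  return PET_MATCH_BASE + '?' + params.toString();
}

const exampleValues = {
  size: { synonym: 'huge', resolved: 'large', isValidated: true },
  // 'family' is an illustrative temperament value, not taken from the post.
  temperament: { synonym: 'family', resolved: 'family', isValidated: true },
  energy: { synonym: 'plays tug of war', resolved: 'medium', isValidated: true }
};
console.log(buildPetMatchUrl(exampleValues));
// https://example.com/pets/match?size=large&temperament=family&energy=medium
```

Because only the resolved values reach the URL, the API never sees "huge" or "plays tug of war", no matter how many synonyms the interaction model grows to include.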
Pet Match also combines Dialog Management and Entity Resolution to disambiguate synonyms that have resolved to more than one value. Pet Match's size slot has 4 values: tiny, small, medium, and large.
The synonym "little" has been mapped to both tiny and small. If the user says, "I want a little dog" the JSON sent to our skill would look like:
"size": {
"name": "size",
"value": "little",
"resolutions": {
"resolutionsPerAuthority": [
{
"authority": "amzn1.er-authority.echo-sdk.[skill-id].sizeType",
"status": {
"code": "ER_SUCCESS_MATCH"
},
"values": [
{
"value": {
"name": "small",
"id": "eb5c1399a871211c7e7ed732d15e3a8b"
}
},
{
"value": {
"name": "tiny",
"id": "d60cadf1a41c651e1f0ade50136bad43"
}
}
]
}
]
},
...
In this case we still get an ER_SUCCESS_MATCH, but our values array now has two items in it. We can detect whether a synonym has resolved to more than one value by checking whether the array's length is greater than 1:
if (this.event.request.intent.slots.size.resolutions.resolutionsPerAuthority[0].values.length > 1) {
// then we need to disambiguate.
}
If the array length is greater than 1, Pet Match disambiguates the slot by using Dialog Management to re-elicit the slot. Let's take a look at the disambiguateSlot function:
function disambiguateSlot() {
let currentIntent = this.event.request.intent;
Object.keys(this.event.request.intent.slots).forEach(function(slotName) {
let currentSlot = this.event.request.intent.slots[slotName];
let slotValue = slotHasValue(this.event.request, currentSlot.name);
if (currentSlot.confirmationStatus !== 'CONFIRMED' &&
currentSlot.resolutions &&
currentSlot.resolutions.resolutionsPerAuthority[0]) {
if (currentSlot.resolutions.resolutionsPerAuthority[0].status.code == 'ER_SUCCESS_MATCH') {
// if there's more than one value that means we have a synonym that
// mapped to more than one value. So we need to ask the user for
// clarification. For example if the user said "mini dog", and
// "mini" is a synonym for both "small" and "tiny" then ask "Did you
// want a small or tiny dog?" to get the user to tell you
// specifically what type mini dog (small mini or tiny mini).
if ( currentSlot.resolutions.resolutionsPerAuthority[0].values.length > 1) {
let prompt = 'Which would you like';
let size = currentSlot.resolutions.resolutionsPerAuthority[0].values.length;
currentSlot.resolutions.resolutionsPerAuthority[0].values.forEach(function(element, index, arr) {
prompt += ` ${(index == size -1) ? ' or' : ' '} ${element.value.name}`;
});
prompt += '?';
let reprompt = prompt;
// In this case we need to disambiguate the value that they
// provided to us because it resolved to more than one thing so
// we build up our prompts and then emit elicitSlot.
this.emit(':elicitSlot', currentSlot.name, prompt, reprompt);
}
} else if (currentSlot.resolutions.resolutionsPerAuthority[0].status.code == 'ER_SUCCESS_NO_MATCH') {
// Here is where you'll want to add instrumentation to your code
// so you can capture synonyms that you haven't defined.
console.log("NO MATCH FOR: ", currentSlot.name, " value: ", currentSlot.value);
if (REQUIRED_SLOTS.indexOf(currentSlot.name) > -1) {
let prompt = "What " + currentSlot.name + " are you looking for";
this.emit(':elicitSlot', currentSlot.name, prompt, prompt);
}
}
}
}, this);
}
The function loops through all the slots and checks whether each one needs to be disambiguated. If so, it builds up the prompt:
let prompt = 'Which would you like';
let size = currentSlot.resolutions.resolutionsPerAuthority[0].values.length;
currentSlot.resolutions.resolutionsPerAuthority[0].values.forEach(function(element, index, arr) {
prompt += ` ${(index == size -1) ? ' or' : ' '} ${element.value.name}`;
});
prompt += '?';
let reprompt = prompt;
If the user said "little" for the size slot, the above snippet will create a prompt and a reprompt that says, "Which would you like small or tiny?" To have Dialog Management reprompt the user for the slot, we then need to emit :elicitSlot with this.emit(':elicitSlot', currentSlot.name, prompt, reprompt); where currentSlot.name is size.
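The prompt-building step can also be pulled out into a small standalone function, which makes it easy to test without an Alexa request. The sketch below is an assumed refactoring, not code from the Pet Match source; the name buildDisambiguationPrompt is hypothetical, and it joins the names with plain separators so the text carries no stray spaces.

```javascript
// Hypothetical helper: builds the clarification question from the values
// array of a slot whose synonym resolved to more than one value.
// Assumes at least two entries, since it is only needed when length > 1.
function buildDisambiguationPrompt(values) {
  const names = values.map(function (element) {
    return element.value.name;
  });
  const last = names[names.length - 1];
  const head = names.slice(0, -1);
  return 'Which would you like ' + head.join(', ') + ' or ' + last + '?';
}

const littleValues = [
  { value: { name: 'small', id: 'eb5c1399a871211c7e7ed732d15e3a8b' } },
  { value: { name: 'tiny', id: 'd60cadf1a41c651e1f0ade50136bad43' } }
];
console.log(buildDisambiguationPrompt(littleValues));
// Which would you like small or tiny?
```

Inside the handler, the returned string would be passed as both the prompt and the reprompt to the :elicitSlot emit shown above.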
There are times when the user may give an answer that doesn't fit with your paradigm. For example, what if the user asks Pet Match for a dragon or a unicorn? As much as I would love to own and care for a dragon, sadly they don't exist. With Entity Resolution we can add a mythical_creatures slot value to the pet slot type and add all the mythical creatures that we want to capture as synonyms. After adding mythical creatures to the petType, the JSON should appear as below:
...
{
"name": "petType",
"values": [
{
"id": null,
"name": {
"value": "dog",
"synonyms": [
"puppy",
"doggie",
"canine",
"canis familiaris",
"canis"
]
}
},
{
"id": null,
"name": {
"value": "mythical_creatures",
"synonyms": [
"dragon",
"unicorn"
]
}
}
]
},
...
In the Pet Match code, we check slotValues.pet.resolved, and if it's equal to mythical_creatures we stop Dialog Management and return a random funny response, for example, "Ah yes, dragons are majestic creatures, however owning one is outlawed." This adds character to our skill. The user can have fun interacting with the skill to see how many different ways it will respond. When we get mythical_creatures we don't even call the Pet Match API, because we know it's something that the API can't match.
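A minimal sketch of that check might look like the following. The function name and the second and third responses are made up for illustration; only the dragon line comes from this post. The optional random argument exists purely so the choice can be made deterministic in tests.

```javascript
// Illustrative responses: only the first one is quoted from the post.
const MYTHICAL_RESPONSES = [
  'Ah yes, dragons are majestic creatures, however owning one is outlawed.',
  'I looked everywhere, but mythical creatures are out of stock.',
  'That pet only exists in storybooks. How about a dog instead?'
];

// Returns a playful response when the pet slot resolved to
// mythical_creatures, or null when the normal dialog should continue.
function pickMythicalResponse(slotValues, random) {
  const pet = slotValues.pet;
  if (!pet || pet.resolved !== 'mythical_creatures') {
    return null;
  }
  const pick = typeof random === 'function' ? random : Math.random;
  const index = Math.floor(pick() * MYTHICAL_RESPONSES.length);
  return MYTHICAL_RESPONSES[index];
}

const dragonRequest = {
  pet: { synonym: 'dragon', resolved: 'mythical_creatures', isValidated: true }
};
console.log(pickMythicalResponse(dragonRequest));
```

In a handler, a non-null result could be spoken with something like this.emit(':tell', response) before any call to the Pet Match API is made.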
One thing to note is that synonyms are not enumerations. The user may say something that isn't one of your values or synonyms, and the slot can still be filled with it. Because what the user said is not in the list of synonyms, this is at best a low confidence match, and the status code returned is ER_SUCCESS_NO_MATCH. For example, what if the user said "pizza" for size? The JSON sent to our skill's service would look like:
...
"size": {
"name": "size",
"value": "pizza",
"resolutions": {
"resolutionsPerAuthority": [
{
"authority": "amzn1.er-authority.echo-sdk.amzn1.ask.skill.[skill-id].sizeType",
"status": {
"code": "ER_SUCCESS_NO_MATCH"
}
}
]
},
"confirmationStatus": "NONE"
}
...
Pizza isn't a valid size, so we should ignore it and re-elicit the slot. You may also want to capture the slot name and value whenever you receive an ER_SUCCESS_NO_MATCH, so you can discover potential new synonyms. For example, what if the user says "itty bitty" for size?
...
"size": {
"name": "size",
"value": "itty bitty",
"resolutions": {
"resolutionsPerAuthority": [
{
"authority": "amzn1.er-authority.echo-sdk.amzn1.ask.skill.[skill-id].sizeType",
"status": {
"code": "ER_SUCCESS_NO_MATCH"
}
}
]
},
"confirmationStatus": "NONE"
}
...
In this case, we want to know that the user said "itty bitty" because it's a valid way to describe a size. We should update the Interaction Model so that the next time a user says "itty bitty" the skill will understand. The disambiguateSlot function checks for ER_SUCCESS_NO_MATCH:
else if (currentSlot.resolutions.resolutionsPerAuthority[0].status.code == 'ER_SUCCESS_NO_MATCH') {
// Here is where you'll want to add instrumentation to your code
// so you can capture synonyms that you haven't defined.
console.log("NO MATCH FOR: ", currentSlot.name, " value: ", currentSlot.value);
if (REQUIRED_SLOTS.indexOf(currentSlot.name) > -1) {
let prompt = "What " + currentSlot.name + " are you looking for";
this.emit(':elicitSlot', currentSlot.name, prompt, prompt);
}
}
Notice that after logging the no match, the slot is re-elicited if it is a required slot.
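If you want the captured values in a form that is easier to search than free-text log lines, one option is to collect them into structured records first. The sketch below assumes the slots object has the same shape as the JSON above; findUnmatchedSlots and the event label are hypothetical names, not part of the Pet Match source.

```javascript
// Hypothetical helper: lists every slot whose first authority reported
// ER_SUCCESS_NO_MATCH, so the values can be reviewed as synonym candidates.
function findUnmatchedSlots(slots) {
  return Object.keys(slots)
    .map(function (key) { return slots[key]; })
    .filter(function (slot) {
      const authority = slot.resolutions &&
        slot.resolutions.resolutionsPerAuthority &&
        slot.resolutions.resolutionsPerAuthority[0];
      return Boolean(authority && authority.status &&
        authority.status.code === 'ER_SUCCESS_NO_MATCH');
    })
    .map(function (slot) {
      return { slot: slot.name, said: slot.value };
    });
}

const requestSlots = {
  size: {
    name: 'size',
    value: 'itty bitty',
    resolutions: {
      resolutionsPerAuthority: [{ status: { code: 'ER_SUCCESS_NO_MATCH' } }]
    },
    confirmationStatus: 'NONE'
  },
  // A slot the user has not filled yet has no value or resolutions.
  energy: { name: 'energy', confirmationStatus: 'NONE' }
};

// One JSON line per request keeps the log easy to filter later.
console.log(JSON.stringify({
  event: 'ER_NO_MATCH',
  unmatched: findUnmatchedSlots(requestSlots)
}));
```

Reviewing these records periodically shows which phrases, like "itty bitty", deserve to become synonyms and which, like "pizza", can be ignored.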
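Pet Match leverages Entity Resolution to map synonyms in three specific ways: mapping what the user says to the values the Pet Match API supports, disambiguating synonyms that resolve to more than one value, and responding playfully to requests the API can't match, such as mythical creatures.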
Every month, developers can earn money for eligible skills that drive some of the highest customer engagement. Developers can increase their level of skill engagement and potentially earn more by improving their skill, building more skills, and making their skills available in the US, UK and Germany. Learn more about our rewards program and start building today.