Build a Strong Language Model to Get the Most Out of Dynamic Entities

Stef Sharp Jun 11, 2019

We recently launched dynamic entities, which can help you personalize Alexa skill experiences by adapting your interaction model at run time without edits, builds, or re-certification. Given that dynamic entities introduce a new authority source, custom slot values, and a new directive, there's a lot to take in. But where do dynamic entities really shine, and how can you best leverage them? I'm excited to share a few use cases where dynamic entities can personalize and improve the user experience of your Alexa skill.

Building the Foundation: The Static Language Model

Before you get started with dynamic entities, you'll need a solid foundation in the form of your language model. Simply throwing all of your slot values into dynamic entities and leaving a skeleton in your static model will leave you hearing error messages and failures. Your language model defines the interactions Alexa expects to fulfill with your skill, and dynamic entities only exist for a short duration. If you leave your static language model nearly empty, there isn't much for Alexa to train against, so you'll hit more cases where nothing quite matches. Let's take a look at an example.

A trivia skill has an intent to handle user guesses and answers called AnswerIntent. The intent accepts a single slot type of ANSWER. This skill surfaces new questions weekly and dynamic entities will allow me to fill in answers at run time. This is exactly what I was hoping for so I only load a single value in the ANSWER slot since I intend to update so frequently. My static catalog looks something like this:

{
    "name": "ANSWER",
    "values": [
        {
            "name": {
                "value": "pizza"
            }
        }
    ]
}

In testing, however, almost none of my responses route to the AnswerIntent. The single value I provided works, but I can't consistently resolve answers for trivia questions.

Define a variety of slot values to reflect the dynamic values you are uploading. If you have a small set of sample utterances and slot values, then there isn't much for Alexa to train against and you'll see more instances of mismatched intents. My skill is a master of “pizza” but not much else.

Instead of that single value, I loaded the ANSWER slot type in my language model with three weeks' worth of slot values in a variety of lengths (one-word, two-word, and three-word), and suddenly my AnswerIntent is capturing responses more consistently.

Aim to have only one single-word intent. These are intents whose sample utterance consists of nothing but a single slot value.

{
    "name": "AnswerOnlyIntent",
    "slots": [
        {
            "name": "answer",
            "type": "ANSWER"
        }
    ],
    "samples": [
        "{answer}"
    ]
}

Combine sessionAttributes that track the state the user is in within the skill, dynamic entities, and the intent context to determine where to route the user. For my trivia example, I allow users to select a trivia round by number, which slots to the SelectRound intent. Both the AnswerIntent and the SelectRound intent potentially accept numbers as valid slot values. To avoid confusion with my ANSWER slot type, once a user starts a game I can store a state in the sessionAttributes to indicate a game is in progress and route number answers to the AnswerIntent handler rather than the SelectRound handler.
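A minimal sketch of that routing idea in Python follows. It assumes a session attribute named `gameState` with the value `IN_PROGRESS`; both names are hypothetical and chosen only for illustration, and the function works on plain dictionaries rather than any particular SDK.

```python
# Hypothetical names: "gameState" and "IN_PROGRESS" are illustrative only.
# Both intents below can capture a spoken number, so they are the ones
# that need state-based disambiguation.
NUMBER_INTENTS = {"AnswerIntent", "SelectRound"}


def pick_handler(intent_name, session_attributes):
    """Return the name of the handler that should process this request."""
    in_game = session_attributes.get("gameState") == "IN_PROGRESS"
    if in_game and intent_name in NUMBER_INTENTS:
        # During a game, a number is treated as an answer, not a round choice.
        return "AnswerIntent"
    # Outside a game, trust the intent Alexa resolved.
    return intent_name


print(pick_handler("SelectRound", {"gameState": "IN_PROGRESS"}))
print(pick_handler("SelectRound", {}))
```

The state check runs before the intent name is trusted, so a number spoken mid-game reaches the answer logic even when the model resolved it to SelectRound.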

Use slot-based grammars where you find yourself otherwise creating similar or overlapping intents. My trivia skill allows users to select a category. Rather than have an intent for each category, I can use a CATEGORY slot type to reflect the user's choice. Slot-based grammars also allow me to simplify repeating phrases in my utterances such as “who,” “what,” “when,” and “where.”

For a generic example, the InformationIntent includes the utterances "{interrogative} did {event} take place", "{interrogative} is {event}", and "{interrogative} {event}", with slot values such as:

  1. Interrogative: "What," "Where," "Why," "How," "When"
  2. Event: "Burning Man," "The Great Depression," "Pax Romana," etc.
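To get a feel for how much ground a slot-based grammar covers, the sketch below expands the three sample utterances against the slot values listed above. This is only an offline illustration in Python; it is not how Alexa trains a model, and the helper name `expand` is made up for this example.

```python
from itertools import product

# The three sample utterances and slot values from the example above.
SAMPLES = [
    "{interrogative} did {event} take place",
    "{interrogative} is {event}",
    "{interrogative} {event}",
]
SLOT_VALUES = {
    "interrogative": ["what", "where", "why", "how", "when"],
    "event": ["burning man", "the great depression", "pax romana"],
}


def expand(samples, slot_values):
    """Fill every sample utterance with every combination of slot values."""
    names = list(slot_values)
    phrases = []
    for sample in samples:
        for combo in product(*(slot_values[name] for name in names)):
            phrases.append(sample.format(**dict(zip(names, combo))))
    return phrases


phrases = expand(SAMPLES, SLOT_VALUES)
print(len(phrases))
print(phrases[0])
```

Three utterances and two small slot types stand in for dozens of phrases that would otherwise need their own samples or intents.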

Static slot values customize and train a model; dynamic slot values customize a point-in-time interaction. Dynamic entities only exist at run time and only for a portion of an interaction. Keeping with the trivia skill example, one week I have a category of “Trees” and want to capture “fur” as a synonym for “fir tree.” I can have that synonym exist only dynamically so that when I pivot to a “Fabrics” category, the “fur” value isn't trying to route to a tree.

However, I want “fur” and “fir” to exist as values for my ANSWER slot type for future trivia interactions. Adding those values to my static model ensures the model is trained to expect these types of answers.

Use Case: Frequent Content Updates

Now that I have a good foundation in my language model to start building against, let's look at how to get the most out of dynamic entities for my trivia skill.

Fresh and frequently updated content keeps users coming back to a skill. There is both the allure of the unknown in the form of “What is coming this week?” as well as the idea that the interaction can't be exhausted in the short term. However, any change to the language model means going through certification. This ensures that the experience is consistent and reliable for users across releases, but certification also takes time. As exciting as trees are, I doubt my players really want to answer the same tree trivia questions for weeks on end.

To create a good balance of dynamic and static content for this use case:

  1. Narrow the acceptable results - My ANSWER slot type has 150 values, but my dynamic entities only use the acceptable answers for that round.

    Question: The name of the fabric chiffon comes from a French word that translates as which of the following?
    A. Cloth B. Veil C. Transparent D. Delicate

    My dynamic entities would include values for “cloth,” “veil,” “transparent,” and “delicate” for this question. Keep in mind that dynamic entities only support 100 values, so skills that rely on large dynamic answer sets are not good candidates for this method. One example is a slot type for books in which more than 100 book titles are acceptable answers.
     
  2. Leverage the AMAZON.FallbackIntent and the static catalog for incorrect answers - In the fabric category example, my static ANSWER slot type has at least 146 values that a user could say but that are incorrect (150 less the four valid answer choices). This doesn't mean you should load a slot with hundreds of incorrect values, since you might find that intent triggering for everything, but it does make the static catalog a second source for possible utterances.

    My trivia skill right now has slot values for trees, fabric, and colors. In testing I say “banana” and the skill simply doesn't respond. Digging into the logs, I notice that nothing is triggered: “banana” is outside the scope of my language model. If I want to catch that phrase because I expect users to say “banana” often, I could add it as a sample utterance to the AMAZON.FallbackIntent and flag the answer as incorrect.

    Combining the static slot values with the AMAZON.FallbackIntent and state management can help me keep users in the skill by handling wrong answers as something other than errors.
     
  3. Update synonyms dynamically to cover misrecognitions - We've all had it happen. A skill goes live and users start to throw all manner of phrases at it. Sometimes speech is rushed and words just don't come out right. In the short term my hands are tied: I can't update the language model, and in some cases I don't want these minor errors to end up as permanent features. The solution is dynamic synonyms on slot values.

    My fabric category is live. I'm looking at the Intent History in the developer console and a lot of users are saying “would” for the AnswerIntent. After a little further testing, I realize that “wool” is being misheard as “would.” I don't use utterances with “would” anywhere else in the skill, and I could add “would” as a static synonym for “wool,” but the fabric category is only around for a short period of time and my users are affected now. I can quickly update my dynamic entities for that question to add the synonym, test in my development environment to confirm, and then push the fix live. My users are getting their trivia points and, once the fabric category retires, I don't have to worry about conflicting utterances if I ever need “would” in another part of the skill.
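The first and third tips above both come down to sending a `Dialog.UpdateDynamicEntities` directive with the right values. Below is a minimal Python sketch that builds the directive as a plain dictionary. The helper name `build_update_directive` and the way `id` values are derived are assumptions made for illustration, and counting synonyms toward the 100-value limit is a deliberately conservative assumption rather than something this post states.

```python
MAX_DYNAMIC_ENTITIES = 100  # limit mentioned in this post


def build_update_directive(slot_type, answers):
    """Build a Dialog.UpdateDynamicEntities directive.

    `answers` maps each slot value to a list of synonyms (possibly empty).
    """
    # Conservative assumption: every value and every synonym counts once.
    total = sum(1 + len(synonyms) for synonyms in answers.values())
    if total > MAX_DYNAMIC_ENTITIES:
        raise ValueError(f"{total} entries exceed the {MAX_DYNAMIC_ENTITIES} limit")

    values = [
        {
            "id": value.replace(" ", "_"),  # illustrative id scheme
            "name": {"value": value, "synonyms": list(synonyms)},
        }
        for value, synonyms in answers.items()
    ]
    return {
        "type": "Dialog.UpdateDynamicEntities",
        "updateBehavior": "REPLACE",
        "types": [{"name": slot_type, "values": values}],
    }


# Tip 1: only the four answer choices for the chiffon question.
chiffon = build_update_directive(
    "ANSWER", {"cloth": [], "veil": [], "transparent": [], "delicate": []}
)

# Tip 3: a temporary synonym for a misrecognition (other choices omitted here).
wool_fix = build_update_directive("ANSWER", {"wool": ["would"]})
print(wool_fix["types"][0]["values"][0]["name"])
```

The returned dictionary would be added to the directives of the skill's response for the turn that asks the question, so the values are in place before the user answers.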

For the frequent content updater, dynamic entities are a great way to supplement the static language model and increase the chances of matching user utterances to the intended result rather than a possible result in a sea of slightly correct answers.
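The second tip, treating the static catalog as a pool of wrong answers, depends on knowing which source matched. A hedged Python sketch follows. It reads the `resolutionsPerAuthority` list from a slot in the request and assumes that the dynamic authority's name contains `.dynamic.`; verify that string check against the current documentation before relying on it. The label names and the `<skill-id>` placeholder are made up for this example.

```python
def classify_answer(slot):
    """Label a slot as a valid choice, an off-question answer, or unrecognized."""
    entries = slot.get("resolutions", {}).get("resolutionsPerAuthority", [])
    matched_static = False
    for entry in entries:
        if entry.get("status", {}).get("code") != "ER_SUCCESS_MATCH":
            continue
        # Assumption: the dynamic authority is identifiable by ".dynamic.".
        if ".dynamic." in entry.get("authority", ""):
            return "valid_choice"  # one of this question's answer choices
        matched_static = True
    # A static-only match is a known word, but not an option for this question.
    return "off_question" if matched_static else "unrecognized"


# Illustrative slot payload; "<skill-id>" is a placeholder, not a real id.
sample_slot = {
    "name": "answer",
    "value": "veil",
    "resolutions": {
        "resolutionsPerAuthority": [
            {
                "authority": "amzn1.er-authority.echo-sdk.<skill-id>.ANSWER",
                "status": {"code": "ER_SUCCESS_MATCH"},
            },
            {
                "authority": "amzn1.er-authority.echo-sdk.dynamic.<skill-id>.ANSWER",
                "status": {"code": "ER_SUCCESS_MATCH"},
            },
        ]
    },
}
print(classify_answer(sample_slot))
```

A `valid_choice` result still has to be compared with the correct answer for the question; the classification only separates in-question choices from everything else.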

Use Case: State-Specific Content

A use case that our previous blog post touched on was using dynamic entities for personalization. Two forms of personalization are user specific and state specific:

  1. Narrow results to the user's preference - For my example, I have a skill with multiplayer chess as a core feature. Players can add other users to a friends list and select their preferred opponent when the EnterGameLobby intent is triggered.

    Now I certainly don't want to add all these usernames to my static slot values. New accounts are created every day, there is no way I can keep up with demand, and putting the skill through certification every day is not ideal. I also don't want to bulk load these users as dynamic entities. If the skill becomes popular, I will quickly pass the 100 value limit. A generic slot lets my skill capture any username, but I want to surface the friends list first. This is the easy part. After a user triggers the EnterGameLobby intent to prepare to pair with another player, I can load all of the usernames in their friends list as dynamic slot values. That way if they ask to play with their friend “BlueOranges,” I won't accidentally match them with player “BlueOrange,” who is a valid player but not the one they asked for.
     
  2. Categorize similar values based on the user state - State-specific content means altering slot values based on where a user is in a skill.

    Let's pivot to another genre: groceries. A user wants milk from the fictional brand Plain. Plain has a large catalog of dairy-based products. I could ask for brand and item type individually. However, the item type slot also includes non-dairy items because I am serving more than one brand. To solve for this, when a user supplies the brand name Plain, I dynamically load the Plain catalog and treat items that match only the static catalog as incorrect choices. This also works in reverse: when users ask for “cheese,” I only load brands that sell cheese. With this setup I have fewer false positives. Keep in mind that dynamic entities last for 30 minutes. Make sure you use the CLEAR updateBehavior to remove unneeded or conflicting dynamic values before the next part of the dialog.
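Both personalization patterns above follow the same load-then-clear rhythm. The Python sketch below shows one way that could look with plain dictionaries. The slot type name `USERNAME`, the helper names, and the decision to truncate the list at 100 are all assumptions made for illustration; a real skill might prioritize recent opponents instead of simply cutting the list.

```python
MAX_DYNAMIC_ENTITIES = 100  # limit mentioned in this post


def friends_directive(friends, slot_type="USERNAME"):
    """Load a user's friends list as dynamic values (hypothetical slot type)."""
    capped = friends[:MAX_DYNAMIC_ENTITIES]  # simple truncation, an assumption
    values = [
        {"id": f"friend_{index}", "name": {"value": name}}
        for index, name in enumerate(capped)
    ]
    return {
        "type": "Dialog.UpdateDynamicEntities",
        "updateBehavior": "REPLACE",
        "types": [{"name": slot_type, "values": values}],
    }


def clear_directive():
    """Remove all dynamic entities before the next part of the dialog."""
    return {"type": "Dialog.UpdateDynamicEntities", "updateBehavior": "CLEAR"}


lobby = friends_directive(["BlueOranges", "RookTaker", "KnightOwl"])
print(lobby["types"][0]["values"][0])
print(clear_directive())
```

Sending the friends directive when EnterGameLobby fires and the clear directive once an opponent is chosen keeps stale usernames from competing with later slots.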

There are many more ways to leverage dynamic entities with other features in Alexa to create language models and experiences that are tailored to your users and their usage. The key takeaways are to build a language model that is robust and to not look at dynamic entities as a way to replace the language model but as a way to augment it.
