Building Conversational Alexa Skills: How to Dynamically Elicit Slots Based on a Previous Answer Using Dialog Management

Justin Jeffress Nov 02, 2018

Over the past few months, I've spent a lot of time deep diving into dialog management and sharing what I've learned with the Alexa developer community. I'm often asked, "Is there a way to change the slot and the question that Alexa prompts the user for next based on the previous answer?" The answer is yes!

Dialog management simplifies creating a multi-turn conversational skill. When you mark at least one of your intent's slots as required to fulfill the intent, the Alexa voice service will keep track of the dialog state and the required slots that still need to be collected. When your back end returns the Dialog.Delegate directive, the Alexa voice service will automatically prompt for the next slot. If you want to override what Alexa will prompt for next, you can do so with the Dialog.ElicitSlot directive.

For example, let's say we are making a coffee shop skill and our customer can choose between coffee and tea. When the customer chooses coffee, our skill needs to ask if they want a light, medium, medium-dark, or dark roast. If the customer wants tea, then our skill should ask whether they want black, green, oolong, or white tea. It wouldn't make sense if, in response to the customer asking for coffee, the skill asked what type of tea they want.

In this technical post, I’ll walk you through how to use dialog management to dynamically elicit slots based on a previous answer from a customer.

Building the Voice Model

For our voice model, we will create an intent called OrderIntent, which will handle the utterances for placing an order at our coffee shop.

Create Utterances

When our customer interacts with our skill, they may say the following things to tell us what they want to order.

Start my order
I'll have coffee
I want to drink tea
I want dark coffee
I want tea to drink
I want oolong tea
Green tea sounds great
tea please

Looking through the utterances above, we can identify the three slots that we need to capture what the user orders: drink, coffeeRoast, and teaType.

Now that we've determined our slots, let's update our utterances and replace the values with their slots so we can capture the values in our skill's back-end code.

Start my order
I'll have {drink}
I want to drink {drink}
I want {coffeeRoast} {drink}
I want {drink} to drink
I want {teaType} {drink}
{teaType} {drink} sounds great
{drink} please

The "I want {teaType} {drink}" utterance will enable the customer to give all the information in one breath, while "I'll have {drink}" requires our skill to ask a follow-up question based on the value given for drink. We'll go over how to do that when we talk about the back end.

Define Slots

Our slot values will correspond to the options that are available for purchase from our coffee shop. The table below shows the values of each of our slots.

drink	coffeeRoast	teaType
coffee	light	black
tea	medium	green
	medium dark	white
	dark	oolong

Activate Dialog Management

Once we've set up our utterances and slots, it's time to activate dialog management so Alexa will handle prompting the user for the new slots. To activate dialog management, we need to set at least one of the slots belonging to the OrderIntent to be required.

Once we mark a slot as required, we need to provide a prompt that Alexa will say to ask for it and the sample utterances the user might say to fill it. We could technically make all three slots required and manually exit dialog management once we have the minimum slot values we need, which is either drink and coffeeRoast or drink and teaType, or (drink && (coffeeRoast || teaType)) for short. Instead, we'll mark only drink as required and use the ElicitSlot directive to control what gets prompted for after the user makes a drink selection. Our prompt will be "Which would you like, coffee or tea?" and our sample utterances will be:

I'll have {drink}
I want to drink {drink}
I want {coffeeRoast} {drink}
I want {drink} to drink
I want {teaType} {drink}
{teaType} {drink} sounds great
{drink} please

Notice that the slot utterances are almost identical to the intent utterances. This affords the customer more leeway in case they give us more information than what was asked for.

Note: We didn't define "{drink}" as one of our utterances. We could have done so, but it's optional because Alexa automatically adds it to the model for us.
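To make the configuration above concrete, here is a rough sketch of how the intent, slots, and the single required slot might look in the skill's interaction model, written as a JavaScript object for readability. The slot type names (DrinkType, CoffeeRoastType, TeaTypeType), the prompt id, and the invocation name are assumptions made for illustration, and the layout reflects my understanding of the interaction model JSON, so check it against the schema reference before relying on it.

```javascript
// Sketch of the interaction model as a JavaScript object (the console stores it as JSON).
// Assumed names: the slot types, the prompt id "Elicit.Slot.drink", and the invocation name.
const toValues = (names) => names.map((value) => ({ name: { value } }));

const utterances = [
  "I'll have {drink}",
  'I want to drink {drink}',
  'I want {coffeeRoast} {drink}',
  'I want {drink} to drink',
  'I want {teaType} {drink}',
  '{teaType} {drink} sounds great',
  '{drink} please'
];

const interactionModel = {
  languageModel: {
    invocationName: 'coffee shop', // assumed
    intents: [{
      name: 'OrderIntent',
      slots: [
        // Slot-level samples: what the user may say when prompted for drink.
        { name: 'drink', type: 'DrinkType', samples: utterances },
        { name: 'coffeeRoast', type: 'CoffeeRoastType' },
        { name: 'teaType', type: 'TeaTypeType' }
      ],
      samples: ['Start my order', ...utterances]
    }],
    types: [
      { name: 'DrinkType', values: toValues(['coffee', 'tea']) },
      { name: 'CoffeeRoastType', values: toValues(['light', 'medium', 'medium dark', 'dark']) },
      { name: 'TeaTypeType', values: toValues(['black', 'green', 'white', 'oolong']) }
    ]
  },
  dialog: {
    intents: [{
      name: 'OrderIntent',
      confirmationRequired: false,
      slots: [
        // Only drink is required, which is enough to activate dialog management.
        { name: 'drink', type: 'DrinkType', elicitationRequired: true,
          confirmationRequired: false, prompts: { elicitation: 'Elicit.Slot.drink' } },
        { name: 'coffeeRoast', type: 'CoffeeRoastType', elicitationRequired: false,
          confirmationRequired: false },
        { name: 'teaType', type: 'TeaTypeType', elicitationRequired: false,
          confirmationRequired: false }
      ]
    }]
  },
  prompts: [{
    id: 'Elicit.Slot.drink',
    variations: [{ type: 'PlainText', value: 'Which would you like, coffee or tea?' }]
  }]
};

const requiredSlots = interactionModel.dialog.intents[0].slots
  .filter((slot) => slot.elicitationRequired)
  .map((slot) => slot.name);
console.log(requiredSlots); // [ 'drink' ]
```

In practice you would normally set all of this up through the developer console's build tab rather than writing it by hand.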

With our front end properly configured, let's take a look at how the back end works so we can elicit our slots based upon the value of drink.

Building Our Back End

While it might be tempting to use a flow chart to express which slot should be elicited next based upon a previous answer, we must be careful not to turn the skill into a phone tree. Phone trees are rigid and brittle and don't often result in great conversational experiences. Instead, we should use situational design to make our skill react to the user's answer based on the information that we have already collected, which is why I prefer to use expressions rather than flow charts to express what information my skill needs based on the situation. Ultimately, the expression that we are trying to meet is (drink && (coffeeRoast || teaType)). Our skill needs a value for drink and either coffeeRoast or teaType, and which of the two depends on the value of drink, which is our situation.

If the value of drink is coffee then our skill needs a value for coffeeRoast. Likewise, if the drink is tea then our skill needs a value for teaType.
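The rule in the last two paragraphs (a value for drink plus whichever detail slot matches it) can be sketched as a small pure function. The name missingSlot is hypothetical and is not part of the skill code in this post; it only illustrates the situation check that the handlers below perform.

```javascript
// Hypothetical helper: returns the name of the slot the skill still needs,
// or null once the order satisfies "drink and its matching detail slot".
function missingSlot(slots) {
  const value = (name) => slots[name] && slots[name].value;

  if (!value('drink')) return 'drink';
  if (value('drink') === 'coffee' && !value('coffeeRoast')) return 'coffeeRoast';
  if (value('drink') === 'tea' && !value('teaType')) return 'teaType';
  return null;
}

console.log(missingSlot({}));                                              // drink
console.log(missingSlot({ drink: { value: 'coffee' } }));                  // coffeeRoast
console.log(missingSlot({ drink: { value: 'tea' } }));                     // teaType
console.log(missingSlot({ drink: { value: 'tea' }, teaType: { value: 'oolong' } })); // null
```

Note that the function never asks for teaType when the drink is coffee, which is exactly the situation we want to avoid.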

Define Handlers

To elicit our slots based on the value of drink, we will define four handlers. These handlers will represent the various situations of our skill and will elicit the appropriate slots.

Our handlers are:

StartedInProgressOrderIntentHandler
CoffeeGivenOrderIntentHandler
TeaGivenOrderIntentHandler
CompletedOrderIntentHandler
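One thing worth keeping in mind as you read the handlers in the following sections: several of their canHandle conditions overlap. The ASK SDK asks handlers in the order they were passed to addRequestHandlers and uses the first one that returns true, so the more specific coffee and tea handlers should come before the general ones. The sketch below imitates that first-match behavior with a tiny stand-in dispatcher; firstMatch and orderRequest are hypothetical helpers, and the conditions are simplified copies of the ones described later in this post.

```javascript
// Stand-in for the SDK's "first handler whose canHandle returns true" rule.
const isOrderIntent = (request) =>
  request.type === 'IntentRequest' && request.intent.name === 'OrderIntent';
const slotValue = (request, name) =>
  request.intent.slots[name] && request.intent.slots[name].value;

// Order matters: specific situations first, general ones last.
const handlers = [
  { name: 'CoffeeGivenOrderIntentHandler',
    canHandle: (r) => isOrderIntent(r) && slotValue(r, 'drink') === 'coffee' && !slotValue(r, 'coffeeRoast') },
  { name: 'TeaGivenOrderIntentHandler',
    canHandle: (r) => isOrderIntent(r) && slotValue(r, 'drink') === 'tea' && !slotValue(r, 'teaType') },
  { name: 'StartedInProgressOrderIntentHandler',
    canHandle: (r) => isOrderIntent(r) && r.dialogState !== 'COMPLETED' },
  { name: 'CompletedOrderIntentHandler',
    canHandle: (r) => isOrderIntent(r) && r.dialogState === 'COMPLETED' }
];

function firstMatch(request) {
  const match = handlers.find((handler) => handler.canHandle(request));
  return match ? match.name : null;
}

// Hypothetical helper that builds a minimal OrderIntent request.
function orderRequest(dialogState, values = {}) {
  const slots = {};
  for (const name of ['drink', 'coffeeRoast', 'teaType']) {
    slots[name] = { name, value: values[name] };
  }
  return { type: 'IntentRequest', dialogState, intent: { name: 'OrderIntent', slots } };
}

console.log(firstMatch(orderRequest('STARTED')));                      // StartedInProgressOrderIntentHandler
console.log(firstMatch(orderRequest('STARTED', { drink: 'coffee' }))); // CoffeeGivenOrderIntentHandler
console.log(firstMatch(orderRequest('COMPLETED', { drink: 'tea', teaType: 'green' }))); // CompletedOrderIntentHandler
```

If StartedInProgressOrderIntentHandler were listed first, it would also claim the "I'll have coffee" request, and the roast question would never be asked.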

StartedInProgressOrderIntentHandler

This handler is a simple one. The canHandle function returns true if:

request.type equals IntentRequest
request.intent.name equals OrderIntent
request.dialogState is not COMPLETED

This will occur when the user says "start my order."

canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === "IntentRequest"
        && handlerInput.requestEnvelope.request.intent.name === "OrderIntent"
        && handlerInput.requestEnvelope.request.dialogState !== 'COMPLETED';
},

The handle function simply returns Dialog.Delegate so Alexa will automatically prompt for the missing required slot.

handle(handlerInput) {
    return handlerInput.responseBuilder
        .addDelegateDirective()
        .getResponse();
}

Below is the whole function:

const StartedInProgressOrderIntentHandler = {
  canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === "IntentRequest"
      && handlerInput.requestEnvelope.request.intent.name === "OrderIntent"
      && handlerInput.requestEnvelope.request.dialogState !== 'COMPLETED';
  },
  handle(handlerInput) {
    return handlerInput.responseBuilder
      .addDelegateDirective()
      .getResponse();
  }
}

CoffeeGivenOrderIntentHandler

Here is where things become a little more interesting. This handler will use Dialog.ElicitSlot to elicit the coffeeRoast slot, since our drink slot is coffee.

The canHandle function will return true if:

request.type equals IntentRequest
request.intent.name equals OrderIntent
request.intent.slots.drink.value is not empty
request.intent.slots.drink.value equals coffee
request.intent.slots.coffeeRoast.value is empty

The last condition is super important. Without it, the skill will never stop eliciting the coffeeRoast slot. Remember the scene in "Dude, Where's My Car?" where they are ordering Chinese food and the drive-thru keeps asking, "And then?" You don't want your skill to pester your customers like that.

canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === "IntentRequest"
        && handlerInput.requestEnvelope.request.intent.name === "OrderIntent"
        && handlerInput.requestEnvelope.request.intent.slots.drink.value 
        && handlerInput.requestEnvelope.request.intent.slots.drink.value === 'coffee'
        && !handlerInput.requestEnvelope.request.intent.slots.coffeeRoast.value
}

Our handle function will define the speech and reprompt just as if it were returning a standard speech response. However, calling addElicitSlotDirective will cause Alexa to elicit the given slot next, even if it wasn't marked required in our voice model.

return handlerInput.responseBuilder
    .speak('Which roast would you like light, medium, medium-dark, or dark?')
    .reprompt('Would you like a light, medium, medium-dark, or dark roast?')
    .addElicitSlotDirective('coffeeRoast')
    .getResponse();

Zoomed out, the CoffeeGivenOrderIntentHandler appears as follows:

const CoffeeGivenOrderIntentHandler = {
  canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === "IntentRequest"
      && handlerInput.requestEnvelope.request.intent.name === "OrderIntent"
      && handlerInput.requestEnvelope.request.intent.slots.drink.value 
      && handlerInput.requestEnvelope.request.intent.slots.drink.value === 'coffee'
      && !handlerInput.requestEnvelope.request.intent.slots.coffeeRoast.value
  },
  handle(handlerInput) {
    return handlerInput.responseBuilder
      .speak('Which roast would you like light, medium, medium-dark, or dark?')
      .reprompt('Would you like a light, medium, medium-dark, or dark roast?')
      .addElicitSlotDirective('coffeeRoast')
      .getResponse();
  }
}
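It can help to picture what these builder calls produce. As a rough sketch, based on my reading of the Dialog interface and response format rather than on anything shown in this post, the directives are small JSON objects placed in the response's directives array. The function names below are hypothetical; in the skill itself the responseBuilder creates these objects for you.

```javascript
// Hypothetical helpers showing the approximate JSON shape of each directive.
function delegateDirective() {
  // Hands the next turn back to Alexa, which prompts for any missing required slot.
  return { type: 'Dialog.Delegate' };
}

function elicitSlotDirective(slotToElicit) {
  // Tells Alexa which slot the user's next answer should fill.
  return { type: 'Dialog.ElicitSlot', slotToElicit };
}

// Approximately what CoffeeGivenOrderIntentHandler's response body carries.
const coffeeResponse = {
  outputSpeech: {
    type: 'SSML',
    ssml: '<speak>Which roast would you like light, medium, medium-dark, or dark?</speak>'
  },
  reprompt: {
    outputSpeech: {
      type: 'SSML',
      ssml: '<speak>Would you like a light, medium, medium-dark, or dark roast?</speak>'
    }
  },
  directives: [elicitSlotDirective('coffeeRoast')],
  shouldEndSession: false // the session stays open so the user can answer
};

console.log(JSON.stringify(coffeeResponse.directives));
// [{"type":"Dialog.ElicitSlot","slotToElicit":"coffeeRoast"}]
```

The key difference from Dialog.Delegate is that our code, not Alexa, supplies both the question and the slot it is meant to fill.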

TeaGivenOrderIntentHandler

By now you should have an understanding of how the TeaGivenOrderIntentHandler will be structured. In fact, it is pretty much the same as CoffeeGivenOrderIntentHandler, except that we check whether drink is tea and teaType has no value. The canHandle function will return true if:

request.type equals IntentRequest
request.intent.name equals OrderIntent
request.intent.slots.drink.value is not empty
request.intent.slots.drink.value equals tea
request.intent.slots.teaType.value is empty

canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === "IntentRequest"
        && handlerInput.requestEnvelope.request.intent.name === "OrderIntent"
        && handlerInput.requestEnvelope.request.intent.slots.drink.value
        && handlerInput.requestEnvelope.request.intent.slots.drink.value === 'tea'
        && !handlerInput.requestEnvelope.request.intent.slots.teaType.value
}

Now that we've identified what we can handle, just like the CoffeeGivenOrderIntentHandler we are going to use the ElicitSlot directive to tell Alexa which slot to elicit next. In this case we will elicit the teaType slot.

return handlerInput.responseBuilder
    .speak("Which would you like black, green, oolong, or white tea?")
    .reprompt("Would you like a black, green, oolong, or white tea?")
    .addElicitSlotDirective('teaType')
    .getResponse();

Combining the above pieces, the TeaGivenOrderIntentHandler will appear as:

const TeaGivenOrderIntentHandler = {
  canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === "IntentRequest"
      && handlerInput.requestEnvelope.request.intent.name === "OrderIntent"
      && handlerInput.requestEnvelope.request.intent.slots.drink.value
      && handlerInput.requestEnvelope.request.intent.slots.drink.value === 'tea'
      && !handlerInput.requestEnvelope.request.intent.slots.teaType.value
  },
  handle(handlerInput) {
    return handlerInput.responseBuilder
      .speak("Which would you like black, green, oolong, or white tea?")
      .reprompt("Would you like a black, green, oolong, or white tea?")
      .addElicitSlotDirective('teaType')
      .getResponse();
  }
}

As you can see, these handlers are very similar. If we were to add another drink like boba and wanted to elicit a bobaType slot when the customer orders it, we could copy the handler, change the drink value check to boba, check that bobaType (rather than teaType) has no value, and update the handle function to elicit the bobaType slot.
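Because only three things change between these handlers, one option is to generate them from a small factory instead of copying code. This is a sketch of that idea, not something the sample skill does: makeElicitHandler and its option names are hypothetical, and the boba prompts are placeholders since the post doesn't define any boba types.

```javascript
// Hypothetical factory: builds a handler that elicits `slotToElicit`
// whenever the drink slot equals `drink` and that slot is still empty.
function makeElicitHandler({ drink, slotToElicit, speech, reprompt }) {
  return {
    canHandle(handlerInput) {
      const request = handlerInput.requestEnvelope.request;
      if (request.type !== 'IntentRequest' || request.intent.name !== 'OrderIntent') {
        return false;
      }
      const slots = request.intent.slots || {};
      const givenDrink = slots.drink && slots.drink.value;
      const detail = slots[slotToElicit] && slots[slotToElicit].value;
      // The "slot is still empty" check keeps us from asking forever.
      return givenDrink === drink && !detail;
    },
    handle(handlerInput) {
      return handlerInput.responseBuilder
        .speak(speech)
        .reprompt(reprompt)
        .addElicitSlotDirective(slotToElicit)
        .getResponse();
    }
  };
}

// Placeholder prompts; a real skill would list the available boba types.
const BobaGivenOrderIntentHandler = makeElicitHandler({
  drink: 'boba',
  slotToElicit: 'bobaType',
  speech: 'Which type of boba would you like?',
  reprompt: 'What type of boba can I get you?'
});
```

The coffee and tea handlers could be produced the same way by passing 'coffee' with 'coffeeRoast' or 'tea' with 'teaType'. Remember that bobaType would also need to be added to the voice model as a slot on OrderIntent.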

CompletedOrderIntentHandler

Once we've collected all the necessary slots, we will need to place the customer's drink order. For our sample, we'll simply have Alexa repeat their order back. To do that we need to make a handler that runs when dialog management has completed collecting the slots.

CompletedOrderIntentHandler's canHandle function will return true when:

request.type equals IntentRequest
request.intent.name equals OrderIntent
request.dialogState is COMPLETED

Translated into code:

canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === "IntentRequest"
        && handlerInput.requestEnvelope.request.intent.name === "OrderIntent"
        && handlerInput.requestEnvelope.request.dialogState === "COMPLETED";
}

Once we're done, we are going to access our slots and build up the response based upon the customer's drink selection.

const drink = handlerInput.requestEnvelope.request.intent.slots.drink.value;
let type; // assigned below, so it cannot be declared with const

if (drink === 'coffee') {
    type = handlerInput.requestEnvelope.request.intent.slots.coffeeRoast.value;
} else if (drink === 'tea') {
    type = handlerInput.requestEnvelope.request.intent.slots.teaType.value;
} else {
    type = 'water';
}

The code above will set the type value based upon the user's drink selection. For example, if the user said, "I want coffee" and then answered "medium" for their preferred roast, then type will be set to medium.

Now that we are able to set type based upon the drink value, all we need to do now is build and return a response:

const speechText = `It looks like you want ${type} ${drink}`;
return handlerInput.responseBuilder
    .speak(speechText)
    .getResponse();

Here we are simply building a string to tell the user what they ordered and returning a response with the ResponseBuilder.
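As with the other handlers, it can be useful to see the pieces together. The following is a sketch of how the snippets above might be assembled into a single CompletedOrderIntentHandler; the only liberties taken are the local request and slots variables, which are introduced for brevity, and declaring type with let so that it can be assigned inside the branches.

```javascript
const CompletedOrderIntentHandler = {
  canHandle(handlerInput) {
    const request = handlerInput.requestEnvelope.request;
    return request.type === 'IntentRequest'
      && request.intent.name === 'OrderIntent'
      && request.dialogState === 'COMPLETED';
  },
  handle(handlerInput) {
    const slots = handlerInput.requestEnvelope.request.intent.slots;
    const drink = slots.drink.value;
    let type;

    // Pick the detail slot that matches the customer's drink.
    if (drink === 'coffee') {
      type = slots.coffeeRoast.value;
    } else if (drink === 'tea') {
      type = slots.teaType.value;
    } else {
      type = 'water';
    }

    const speechText = `It looks like you want ${type} ${drink}`;
    return handlerInput.responseBuilder
      .speak(speechText)
      .getResponse();
  }
};
```

For an order where drink is coffee and coffeeRoast is medium, the handler would say "It looks like you want medium coffee".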

Conclusion

Following these steps enables you to leverage dialog management to easily change what your skill will prompt for next based upon the value of a slot. Dialog management takes care of keeping the conversation open based upon the required slots, but it allows you to elicit any slot that belongs to the intent with ElicitSlot even if the slot isn't marked required.

Now that you've read through this post, try to think about how you can put these techniques to use in your own skills. Let's continue the discussion online! You can find me on Twitter @SleepyDeveloper.
