Improve Your Model with the Dialog Evaluation Tool (Beta)


If you test your Alexa Conversations skill and the conversational flow isn't as you expect, you can use the dialog evaluation tool to improve your Alexa Conversations skill model interactively. To access the dialog evaluation tool, you interact with your skill on the command line. You use the dialog command in the latest version of the Alexa Skills Kit Command-Line Interface (ASK CLI) that supports Alexa Conversations.

Advanced users can save and manually edit the Alexa Conversations Description Language (ACDL) that represents the model corrections for specific conversations, and call REST APIs to test and correct the model. The following sections focus primarily on the dialog evaluation tool, but the section How to test and correct your model by using REST APIs provides some information about the advanced user flow.

Overview

You create Alexa Conversations skills by providing actions, types, utterance sets, and dialogs. Dialogs bring these concepts together to create examples of conversational flows between the user and Alexa.

When you test your skill, the interactions between the user and Alexa might not always be as you expect. You can improve your Alexa Conversations skill model by using the dialog evaluation tool to do the following actions:

  • Test your dialogs and see the simulator debug output or the simulator API output so that you can understand what actions to take to resolve the issue.
  • Update dialog turns as you test and use this information to update the model to resolve the issue.

To launch the dialog evaluation tool, you call the dialog command of the ASK CLI (askx) with two new flags, evaluate and verbose. This command opens an interactive environment in which you can simulate a multi-turn conversation with Alexa and provide corrections when Alexa makes a mistake. You provide your corrections in ACDL format. You can record the corrections you made and use them the next time you deploy your skill to improve the accuracy of the Alexa Conversations skill model.

Restrictions

The dialog evaluation tool currently has the following restrictions:

  • The dialog evaluation tool is currently only available for the command line.
  • Before you use the dialog evaluation tool, you must deploy the latest version of your skill.
  • The dialog evaluation tool currently only works for the en-US locale.
  • The dialog evaluation tool currently only runs on the Alexa Conversations skill model, not the custom skill interaction model.

Prerequisites

Before you use the dialog evaluation tool, you must meet the following prerequisites:

Walkthrough of a simple example

Before getting into detail about the tool, it's helpful to see an example. The following interaction is a simple, line-by-line example of what you might see on the command line when you use the dialog evaluation tool to correct a cake-ordering skill. The last Alexa turn of the following interaction is incorrect until you correct it through the dialog evaluation tool.

Original interaction

User: Open Cupcake Shop.

Alexa: What can I help you with?
User: I'd like to order a cake.

Alexa: What kind of cake are you thinking?
User: A yellow half sheet cake.

Alexa: Great. That's a half yellow cake with sheet icing. (This Alexa response doesn't make sense.)

Corrected interaction after dialog evaluation tool

User: Open Cupcake Shop.

Alexa: What can I help you with?
User: I'd like to order a cake.

Alexa: What kind of cake are you thinking?
User: A yellow half sheet cake.

Alexa: Great. That's a yellow half sheet cake. (Correct response after you correct the model by using the dialog evaluation tool.)

The following table walks you through the text that you see on the command line as you interact with the dialog evaluation tool.

Command line Description
askx dialog --locale en-US --evaluate --verbose You start the interactive skill session on a command line from within your skill project folder by using the dialog command of the Alexa Conversations version of the ASK CLI. The evaluate and verbose flags indicate that you want to use the dialog evaluation tool.
User > Open Cupcake Shop You launch the skill by typing the request you'd say if you were interacting with this skill on an Alexa-enabled device. (This comment applies to all other User > lines.)
Alexa > What can I help you with? Alexa provides a response, the same as if you were interacting with this skill on an Echo device. (This comment applies to all other Alexa > lines.)
User > I'd like to order a cake. As the skill user, you type in your request.
[acdl]: received(Invoke, "I'd like to order a cake.") The dialog evaluation tool represents the user request in ACDL format, which shows that the user request act is Invoke.
[info]: User request act is Invoke. The dialog evaluation tool explains what happened in the previous line.
Alexa > What kind of cake are you thinking? Alexa responds to the user.
[acdl]: response(generalCakeRequestApla, Request {arguments = [PlaceOrderAPI.arguments.color, PlaceOrderAPI.arguments.size, PlaceOrderAPI.arguments.icing]}) The dialog evaluation tool represents the Alexa response in ACDL format. The APLA template is generalCakeRequestApla. The response act is Request, meaning that Alexa is asking for the arguments required to invoke the API. The missing arguments that Alexa is asking for are color, size, and icing.
[info]: Alexa responds with generalCakeRequestApla to Request the arguments color, size and icing for PlaceOrderAPI. The dialog evaluation tool explains what happened in the previous line.
Do you accept this response [y/n]? y The dialog evaluation tool asks if Alexa's response sounds correct to you. You enter y.
User > A yellow half sheet cake. As the skill user, you type in your next request.
[acdl]: type U0 {Color color0 Size size0 Icing icing0}

[acdl]: u0 = received<U0>(Inform, "A {color0|yellow} {size0|half} {icing0|sheet} cake.")
The dialog evaluation tool shows the slot values in the user request.
[info]: User request act is Inform, with slots yellow as Color, half as Size, sheet as Icing The dialog evaluation tool explains what happened in the previous line.
Alexa > Great. That's a half yellow cake with sheet icing. Alexa responds incorrectly; sheet icing doesn't make sense.
[acdl]: response(confirmCakePropsApla, ConfirmArgs {arguments=[someApi.color, someApi.size, someApi.icing]}) The dialog evaluation tool represents the Alexa response in ACDL format.
[info]: Alexa responds with confirmCakePropsApla to ConfirmArgs for the color, size, and icing arguments of some API. The dialog evaluation tool explains what happened in the previous line.
Do you accept this response [y/n]? n The Alexa response was incorrect, so you enter n to indicate that you want to correct it.
In correction mode: The dialog evaluation tool enters correction mode.
Prediction: type U0 {Color color0 Size size0 Icing icing0}

u0 = received<U0>(Inform, "A {color0|yellow} {size0|half} {icing0|sheet} cake.")
The dialog evaluation tool shows the prediction that caused the response you want to correct.
Is this correct [y/n]: n You indicate that you want to correct the prediction.
Correction for type: type U1 {Color color0 Size size0}

Correction for event (press enter if no change): u0 = received<U1>(Inform, "A {color0|yellow} {size0|half sheet} cake.")
You enter the correction in ACDL format.
---------------------------------- The dialog evaluation tool renders a line for readability.
Prediction: response(confirmCakePropsApla, ConfirmArgs {arguments=[someApi.color, someApi.size]}, surfaceForm="Great. That is a yellow half sheet cake.") The dialog evaluation tool makes the next prediction based on the correction you made.
Is this correct [y/n]: y You indicate that the new prediction is correct.
---------------------------------- The dialog evaluation tool renders a line for readability.
End of turn [y/n]: y The dialog evaluation tool asks if you want to end the turn, and therefore the correction mode for the turn. You answer yes.
Alexa > Great. That's a yellow half sheet cake. Alexa responds to the user. Now that you corrected the prediction, the response is correct.
User > .save You save the corrections to a file.
User > .quit You exit the skill session.
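
The inline annotations in the [acdl] lines follow a {variable|text} pattern. As a minimal sketch (not part of the dialog evaluation tool), the following Python snippet extracts those annotations so that you can compare the predicted slots with the corrected ones. The function names and the regular expression are illustrative assumptions based only on the syntax shown in this walkthrough.

```python
import re

# Matches inline slot annotations of the form {variable|text}, as they
# appear in the [acdl] lines of the walkthrough. This pattern is an
# assumption based on those examples, not an official ACDL grammar.
SLOT_PATTERN = re.compile(r"\{(\w+)\|([^}]*)\}")


def extract_slots(annotated_utterance: str) -> list[tuple[str, str]]:
    """Return (variable, text) pairs in the order they appear."""
    return SLOT_PATTERN.findall(annotated_utterance)


def plain_text(annotated_utterance: str) -> str:
    """Remove the annotations and return the utterance as the user typed it."""
    return SLOT_PATTERN.sub(lambda match: match.group(2), annotated_utterance)


predicted = "A {color0|yellow} {size0|half} {icing0|sheet} cake."
corrected = "A {color0|yellow} {size0|half sheet} cake."

print(extract_slots(predicted))
print(extract_slots(corrected))
print(plain_text(corrected))
```

Both annotated strings reduce to the same plain utterance; only the slot boundaries differ, which is exactly what the correction changes.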

Dialog evaluation tool actions

The following list shows the actions you can take as you interact with the dialog evaluation tool:

Start interactive mode

To start the interactive skill session, enter the following command on a command line from within your skill project folder.

askx dialog --locale en-US --evaluate --verbose

Then, at the User > prompt, enter the following command.

Open <invocation name>

Correct a slot type error in a user turn

If the slot type is incorrect, you can enter (and assign a name to) the slot type by entering the corrected user turn, in ACDL format, after Correction:. The slot type must be one of the types you specified in the interaction model file of your skill.

Example

User > Chocolate
[acdl]: type U0 {Color color0} 
[acdl]: u0 = received<U0>(Inform, "{color0|Chocolate}")
[info]: User request act is Inform, with slot chocolate as Color
Alexa > What is the size of cake?
[acdl]: response(requestSizeApla, Request {arguments=[size]})
[info]: Alexa responds with requestSizeApla, to Request the argument size
Do you accept this response [y/n]? n
In correction mode:
   Prediction: type U0 {Color color0} 
               u0 = received<U0>(Inform, "{color0|Chocolate}")
   Is this correct [y/n]: n
   Correction for type: type U1 {Icing icing0}
   Correction for event (press enter if no change): u0 = received<U1>(Inform, "{icing0|Chocolate}")
   ----------------------------------
   Prediction: response(requestSizeApla, Request {arguments=[size]})
   Is this correct [y/n]: y
   ----------------------------------
   End of turn [y/n]: y
Alexa > What is the size of cake?
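
Because the corrected slot type must be one that the skill already defines, it can help to check a type correction before you enter it. The following Python sketch parses a type line such as the one in the example and reports any slot types that aren't in a known set. The KNOWN_SLOT_TYPES set and the helper names are hypothetical; in practice the set would come from your skill's interaction model file.

```python
import re

# Hypothetical set of slot types. In a real project, these names would be
# read from the skill's interaction model file.
KNOWN_SLOT_TYPES = {"Color", "Size", "Icing"}

# Assumed shape of a type line, based on the examples on this page:
#   type U1 {Icing icing0}
TYPE_DECL = re.compile(r"^type\s+(\w+)\s*\{(.*)\}\s*$")


def parse_type_declaration(line: str) -> tuple[str, list[tuple[str, str]]]:
    """Split 'type U1 {Icing icing0}' into ('U1', [('Icing', 'icing0')])."""
    match = TYPE_DECL.match(line.strip())
    if match is None:
        raise ValueError(f"not a type declaration: {line!r}")
    tokens = match.group(2).split()
    if len(tokens) % 2 != 0:
        raise ValueError(f"expected 'Type name' pairs in: {line!r}")
    return match.group(1), list(zip(tokens[0::2], tokens[1::2]))


def element_type(slot_type: str) -> str:
    """Unwrap List<Topping> to Topping; return other types unchanged."""
    if slot_type.startswith("List<") and slot_type.endswith(">"):
        return slot_type[len("List<"):-1]
    return slot_type


def unknown_slot_types(line: str, known: set[str] = KNOWN_SLOT_TYPES) -> list[str]:
    """Return the slot types in the declaration that aren't in `known`."""
    _, fields = parse_type_declaration(line)
    return [t for t, _ in fields if element_type(t) not in known]


print(parse_type_declaration("type U1 {Icing icing0}"))
print(unknown_slot_types("type U1 {PartySize partySize0}"))
```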

Correct a request act error in a user turn

You can enter the correct request act for the user utterance by entering the corrected user turn, in ACDL format, after Correction:.

The request act can be Affirm, Deny, Inform, or Invoke. For details about all request acts, see Request Acts in the Alexa Conversations Core Library.

Example

User > Wait, can it feed ten people?
[acdl]: received(Inform, "can it feed ten people")
[info]: User request act is Inform
Alexa > A half sheet cake feeds about 20 people.
[acdl]: portionMsg0 = GetPortionAPI(size0)
[info]: Alexa calls GetPortionAPI with argument size0
[acdl]: response(getCakePortionApla, Notify {action=GetPortionAPI, success=true}, surfaceform="A half sheet cake feeds about 20 people.")
[info]: Alexa responds with getCakePortionApla to Notify that the call to GetPortionAPI was successful
Do you accept this response [y/n]? n
In correction mode:
  Prediction: received(Inform, "can it feed ten people")
  Is this correct [y/n]: n
  Correction for type: type U1 {PartySize partySize0}
  Correction for event (press enter if no change): u1 = received<U1>(Invoke, "can it feed {partySize0|ten} people")
  ----------------------------------
  Prediction: portionMsg0 = GetPortionAPI(size0, partySize0)
  Is this correct [y/n]: y
  ----------------------------------
  End of turn [y/n]: n
  Correction: response(getCakePortionApla, Notify {action=GetPortionAPI, success=true}, payload=portionMsg0)
  ----------------------------------
  End of turn [y/n]: y
Alexa > Yes, a half sheet cake feeds 10 people.
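
As a rough illustration of the structure of a received(...) event, the following Python sketch parses an event line and rejects request acts other than the four listed in this section. The regular expression is an assumption derived from the examples on this page, not a full ACDL parser, and the Core Library might define additional request acts.

```python
import re

# Request acts named in this section. The Alexa Conversations Core Library
# is the authoritative list; this set is only what this page mentions.
REQUEST_ACTS = {"Affirm", "Deny", "Inform", "Invoke"}

# Assumed shape, based on the examples on this page:
#   [variable =] received[<Type>](Act, "utterance")
RECEIVED = re.compile(
    r'^(?:(?P<var>\w+)\s*=\s*)?'
    r'received(?:<(?P<type>\w+)>)?'
    r'\(\s*(?P<act>\w+)\s*,\s*"(?P<utterance>.*)"\s*\)$'
)


def parse_received(line: str) -> dict:
    """Return the variable, type, act, and utterance of a received event."""
    match = RECEIVED.match(line.strip())
    if match is None:
        raise ValueError(f"not a received event: {line!r}")
    if match.group("act") not in REQUEST_ACTS:
        raise ValueError(f"unknown request act: {match.group('act')!r}")
    return match.groupdict()


event = parse_received(
    'u1 = received<U1>(Invoke, "can it feed {partySize0|ten} people")'
)
print(event["act"], event["type"], event["utterance"])
```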

Correct or add an API in an Alexa turn

You can correct or add an API by entering the corrected API that Alexa should call, in ACDL format, after Correction:.

Example

User > Wait, how many people can that feed?
[acdl]: received(Invoke, "how many people can that feed")
[info]: User request act is Invoke
Alexa > Sorry, I don't understand
[acdl]: response(AlexaConversationsOutOfDomain, OutOfDomain {})
[info]: Alexa responds with AlexaConversationsOutOfDomain to let the user know their request is not supported by the skill
Do you accept this response [y/n]? n
In correction mode:
  Prediction: received(Invoke, "how many people can that feed")
  Is this correct [y/n]: y
  ----------------------------------
  Prediction: response(AlexaConversationsOutOfDomain, OutOfDomain {})
  Is this correct [y/n]: n
  Correction: portionMsg0 = GetPortionAPI(size0)
  ----------------------------------
  End of turn [y/n]: n
  Correction: response(getCakePortionApla, Notify {action=GetPortionAPI, success=true}, payload=portionMsg0)
  ----------------------------------
  End of turn [y/n]: y
Alexa > A half sheet cake feeds about 20 people.

Correct input arguments to the API in an Alexa turn

You can correct an input argument to an API by entering the corrected Alexa (response) turn, in ACDL format, after Correction:.

Example

User > Does a large cake feed ten people?
[acdl]: type U0 {Size size0 PartySize partySize0}
[acdl]: u0 = received<U0>(Invoke, "Does a {size0|large} cake feed {partySize0|ten} people")
[info]: User request act is Invoke, with slots large as Size, ten as PartySize
Alexa > A large cake feeds about 20 people
[acdl]: portionMsg0 = GetPortionAPI(size0)
[info]: Alexa calls GetPortionAPI with argument size0
[acdl]: response(getCakePortionApla, Notify {action=GetPortionAPI, success=true}, payload=portionMsg0)
[info]: Alexa responds with getCakePortionApla to Notify that the call to GetPortionAPI was successful
Do you accept this response [y/n]? n
In correction mode:
  Prediction: type U0 {Size size0 PartySize partySize0}
              u0 = received<U0>(Invoke, "Does a {size0|large} cake feed {partySize0|ten} people")
  Is this correct [y/n]: y
  ----------------------------------
  Prediction: portionMsg0 = GetPortionAPI(size0)
  Is this correct [y/n]: n
  Correction: portionMsg0 = GetPortionAPI(size0, partySize0)
  ----------------------------------
  End of turn [y/n]: n
  Correction: response(getCakePortionApla, Notify {action=GetPortionAPI, success=true}, payload=portionMsg0)
  ----------------------------------
  End of turn [y/n]: y
Alexa > Yes, a large cake feeds more than 10 people.

Correct Alexa's response

You can change a response to an API call or to another response. You make this change by entering the corrected Alexa (response) turn, in ACDL format, after Correction:. The response should contain a combination of an APL template and a response act that the skill already specifies.

Example
The following example shows how to change a response to an API call.

Prediction: response(AlexaConversationsOutOfDomain, OutOfDomain {})
Correction: response0 = AddCustomPizzaApi(size0, crust0, toppingsList0, cheese0)

Example
The following example shows how to change the response to another response.

Prediction: response(AlexaConversationsOutOfDomain, OutOfDomain {})
Correction: response(requestToppingsApla, Request {arguments=[toppings]})

End the current turn

To end the turn explicitly, enter .endTurn.

Example

Prediction: response(requestSizeApla, Request {arguments=[size]})
Is this correct [y/n]: n
Correction: .endTurn

Save the corrections to a file

To save the corrections to a file, enter .save. To see the format of the corrections within the file, see Interaction block format.

Example

Alexa > I have placed your order.
User > .save

Check what variables you created

To get a list of the variables you've created as you've made model corrections in this session, enter .vars.

Example

Alexa > I have placed your order.
User > .vars

Exit interactive mode

To exit interactive mode and end the skill session, enter .quit or press Ctrl+C.

Example

Alexa > I have placed your order.
User  > .quit

Example of a complete dialog evaluation session

The following example shows how you might interact with the dialog evaluation tool to improve the model for a pizza-ordering skill.

========================================== Welcome to ASK Dialog =====================================================================
============= In interactive mode, type your utterance text onto the console and hit enter ===========================================
============= Alexa will then evaluate your input and give a response! ===============================================================
============= Use ".save" to save list of utterances to a file. ===================
============= Use ".vars" to check what variables have been created ==================================================================
============= Use ".endTurn" to explicitly end the current turn ======================================================================
============= You can exit the interactive mode by entering ".quit" or "ctrl + c". ===================================================

User  > open pizzabot
Alexa > Welcome! what do you want to order?

User  > i want a small pizza
[acdl]: type U0 {Size size0}
[acdl]: u0 = received<U0>(Invoke, "i want a {size0|small} pizza")
[info]: User request act is Invoke, with slot small as Size
Alexa > What kind of crust would you like?
[acdl]: response(requestCrustResponseApla, Request {arguments = [crust]})
[info]: Alexa responds with requestCrustResponseApla, to Request the argument crust
Do you accept this response [y/n]? y

User  > deep dish with light cheese
[acdl]: type U1 {Crust crust0}
[acdl]: u1 = received<U1>(Inform, "{crust0|deep dish} with light cheese")
[info]: User request act is Inform, with slot deep dish as Crust
Alexa > Sorry I don't understand
[acdl]: response(AlexaConversationsOutOfDomain, OutOfDomain {})
[info]: Alexa responds with AlexaConversationsOutOfDomain to let the user know their request is not supported by the skill
Do you accept this response [y/n]? n
In correction mode:
  Prediction: type U1 {Crust crust0}
              u1 = received<U1>(Inform, "{crust0|deep dish} with light cheese")
  Is this correct [y/n]: n
  Correction for type: type U2 {Crust crust0 Cheese cheese0}
  Correction for event (press enter if no change): u1 = received<U2>(Inform, "{crust0|deep dish} with {cheese0|light cheese}")
  ----------------------------------
  Prediction: response(requestToppingsApla, Request {arguments=[toppings]})
  Is this correct [y/n]: y
  ----------------------------------
  End of turn [y/n]: y
Alexa > What toppings do you want? I can take up to 5 toppings.

User  > mushrooms and green peppers
[acdl]: type U3 {List<Topping> toppingsList0}
[acdl]: u2 = received<U3>(Inform, "{toppingsList0|mushrooms} and {toppingsList0|green peppers}")
[info]: User request act is Inform, with slots mushrooms and green peppers as Topping
Alexa > What size do you want?
[acdl]: response(requestSizeResponseApla, Request {arguments = [size]})
[info]: Alexa responds with requestSizeResponseApla, to Request the argument size
Do you accept this response [y/n]? n
In correction mode:
  Prediction: type U3 {List<Topping> toppingsList0}
              u2 = received<U3>(Inform, "{toppingsList0|mushrooms} and {toppingsList0|green peppers}")
  Is this correct [y/n]: y
  ----------------------------------
  Prediction: response(requestSizeResponseApla, Request {arguments = [size]})
  Is this correct [y/n]: n
  Correction: response0 = AddCustomPizzaApi(size0, crust0, toppingsList0, cheese0)
  ----------------------------------
  Prediction: response(orderPlacedApla, Notify {action=AddCustomPizzaApi, success=true}, payload=response0)
  Is this correct [y/n]: y
  ----------------------------------
  End of turn [y/n]: y
Alexa > I have placed your order.

User  > .save
User  > .quit

================================= Goodbye! =========================================

Interaction block format

When you use the dialog evaluation tool and make corrections to your model, you can save the corrections to a file and edit the file manually later. The file contains an interaction block, which provides feedback to the model for a particular instance of a conversation. The interaction block can include zero or more correction blocks, depending on how many turns had mistakes that you corrected.

A correction block has an actual section (which is what the model predicted) and an expected section (which is what the model should have predicted). Alexa Conversations uses the interaction block to represent the corrections that you provide.

The interaction blocks can be in their own ACDL file, or you can add them to other ACDL files.

The following example shows the format of an interaction block.

namespace test.interactions

import com.amazon.alexa.ask.conversations.*
import com.amazon.alexa.schema.*
import com.amazon.ask.types.builtins.AMAZON.*
import test.pizzaOrderingExampleSkill.*
import prompts.*
import slotTypes.*

type U0 {
  Size size0
}

type U1 {
  Crust crust0
}

type U2 {
  Crust crust0
  Cheese cheese0
}

type U3 {
  List<Topping> toppingsList0
}

interaction {
  u0 = received<U0>(Invoke, "i want a {size0|small} pizza")
  response(requestCrustResponseApla, Request {arguments = [crust]})

  actual {
     u1 = received<U1>(Inform, "{crust0|deep dish} with light cheese")
     response(AlexaConversationsOutOfDomain, OutOfDomain {})
  }
  expected {
     u1 = received<U2>(Inform, "{crust0|deep dish} with {cheese0|light cheese}")
     response(requestToppingsApla, Request {arguments=[toppings]})
  }

  u2 = received<U3>(Inform, "{toppingsList0|mushrooms} and {toppingsList0|green peppers}")
  actual {
     response(requestSizeResponseApla, Request {arguments = [size]})
  }
  expected {
     response0 = AddCustomPizzaApi(size0, crust0, toppingsList0, cheese0)
     response(orderPlacedApla, Notify {action=AddCustomPizzaApi, success=true}, payload=response0)
  }
}
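
If you edit interaction files by hand or generate them from your own tooling, a small helper can keep the actual and expected sections consistently laid out. The following Python sketch renders one correction block from two lists of ACDL lines. The render_correction function is hypothetical, and the two-space indentation is an assumption taken from the example above, not a requirement of ACDL.

```python
def render_correction(actual: list[str], expected: list[str], indent: str = "  ") -> str:
    """Render an actual/expected pair in the layout of the example above."""

    def section(name: str, lines: list[str]) -> str:
        # Each ACDL line sits one level deeper than its section header.
        body = "\n".join(f"{indent}{indent}{line}" for line in lines)
        return f"{indent}{name} {{\n{body}\n{indent}}}"

    return section("actual", actual) + "\n" + section("expected", expected)


block = render_correction(
    actual=[
        "response(requestSizeResponseApla, Request {arguments = [size]})",
    ],
    expected=[
        "response0 = AddCustomPizzaApi(size0, crust0, toppingsList0, cheese0)",
        "response(orderPlacedApla, Notify {action=AddCustomPizzaApi, success=true}, payload=response0)",
    ],
)
print(block)
```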

How to test and correct your model by using REST APIs

If you're an advanced user, you might want to test and correct the model by calling REST APIs instead of by using the dialog evaluation tool. The workflow for directly calling the APIs is as follows:

  1. You make a Skill Simulation POST Request to send the utterance to the skill by using the input parameter. This call returns an id in the response.
  2. You make a Skill Simulation GET Request to get the model prediction output for the simulation ID that the previous call returned.
  3. If you don't agree with what the model predicted, you provide the corrected actions and corrected utterance set by making a Modify Turn PUT Request to update actions.
  4. You make a Skill Simulation POST Request to send the same utterance to your skill again.
  5. You make a Skill Simulation GET Request to get the modified model prediction output for the simulation ID that the previous call returned.
  6. You put the corresponding corrections in your ACDL interaction block.
  7. You repeat the previous steps until the dialog flow is complete.
  8. You compile the ACDL files.
  9. You deploy the changes with askx, which calls the skill package service with the model that now includes the corrections.
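
Steps 1 through 5 of this workflow can be sketched as a single function. The following Python example is a sketch under stated assumptions: the simulate, get_prediction, review, and modify_turn callables are hypothetical stand-ins for your own code that calls the Skill Simulation and Modify Turn APIs, and FakeService is an in-memory substitute so that the example runs without network access. No request or response payloads are shown because this page doesn't specify them.

```python
from typing import Callable, Optional


def evaluate_turn(
    utterance: str,
    simulate: Callable[[str], str],
    get_prediction: Callable[[str], str],
    review: Callable[[str], Optional[str]],
    modify_turn: Callable[[str], None],
) -> tuple[str, Optional[str]]:
    """Run steps 1-5 for one utterance.

    Returns (final_prediction, correction). The correction is None when the
    first prediction is accepted and no Modify Turn call is made.
    """
    simulation_id = simulate(utterance)          # Step 1: POST simulation
    prediction = get_prediction(simulation_id)   # Step 2: GET simulation
    correction = review(prediction)              # Your decision: accept or correct
    if correction is None:
        return prediction, None
    modify_turn(correction)                      # Step 3: PUT turnPredictions
    simulation_id = simulate(utterance)          # Step 4: POST simulation again
    return get_prediction(simulation_id), correction  # Step 5: GET again


class FakeService:
    """In-memory stand-in for the REST APIs (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.corrected = False
        self.calls: list[str] = []

    def simulate(self, utterance: str) -> str:
        self.calls.append("POST simulation")
        return f"sim-{len(self.calls)}"

    def get_prediction(self, simulation_id: str) -> str:
        self.calls.append("GET simulation")
        return "expected-response" if self.corrected else "out-of-domain"

    def modify_turn(self, correction: str) -> None:
        self.calls.append("PUT turnPredictions")
        self.corrected = True


def review(prediction: str) -> Optional[str]:
    # Accept the expected response; otherwise supply a correction.
    return None if prediction == "expected-response" else "expected-response"


service = FakeService()
final, correction = evaluate_turn(
    "how many people can that feed",
    service.simulate,
    service.get_prediction,
    review,
    service.modify_turn,
)
print(final, correction, service.calls)
```

Passing the API calls in as functions keeps the control flow separate from HTTP details, so you can replace the fake with real requests once you know the payload formats.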

Modify turn request

Corrects the model prediction that you received.

Request

HTTP method and URI path

PUT /v1/skills/<skillId>/stages/<stage>/locales/<locale>/conversations/turnPredictions

Request parameters

Field | Description | Type | Required
skillId | The unique ID of your Alexa Conversations skill. | String | Yes
stage | The stage of the skill. Allowed values are development and live (case sensitive). | String | Yes
locale | The locale that this API supports. Currently, the only valid value is en-US. | String | Yes
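
The following Python sketch shows one way to build the request path from these parameters and to reject values that this page says are invalid. The function name and the sample skill ID are hypothetical. The sketch builds only the path; it doesn't send a request.

```python
# Allowed values as stated in the request parameters on this page.
ALLOWED_STAGES = ("development", "live")   # case sensitive
ALLOWED_LOCALES = ("en-US",)               # currently the only valid locale


def modify_turn_path(skill_id: str, stage: str, locale: str = "en-US") -> str:
    """Return the URI path for the Modify Turn PUT request."""
    if not skill_id:
        raise ValueError("skill_id is required")
    if stage not in ALLOWED_STAGES:
        raise ValueError(f"stage must be one of {ALLOWED_STAGES}, got {stage!r}")
    if locale not in ALLOWED_LOCALES:
        raise ValueError(f"locale must be one of {ALLOWED_LOCALES}, got {locale!r}")
    return (
        f"/v1/skills/{skill_id}/stages/{stage}"
        f"/locales/{locale}/conversations/turnPredictions"
    )


# "amzn1.ask.skill.example" is a placeholder, not a real skill ID.
path = modify_turn_path("amzn1.ask.skill.example", "development")
print(path)
```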

Response

A successful request returns 201 SUCCESS, which means that the service accepted the correction.

Errors

Code | Description
400 BAD REQUEST | The provided payload is invalid.
401 NOT AUTHORIZED | The access token is invalid, expired, or doesn't have the appropriate permissions.
403 FORBIDDEN | The caller doesn't have permissions for this vendor.
404 NOT FOUND | Vendor not found.
429 TOO MANY REQUESTS | Too many requests received.
500 INTERNAL SERVER ERROR | Service is unavailable or encountered a problem.
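
When you call this API from a script, you might map the status codes above to messages and decide which failures are worth retrying. The following Python sketch does that. The descriptions are copied from this page, but the retry policy (retrying on 429 and 500 only) is an assumption of this example, not documented behavior.

```python
SUCCESS_STATUS = 201

# Descriptions copied from the error list on this page.
MODIFY_TURN_ERRORS = {
    400: "The provided payload is invalid.",
    401: "The access token is invalid, expired, or doesn't have the appropriate permissions.",
    403: "The caller doesn't have permissions for this vendor.",
    404: "Vendor not found.",
    429: "Too many requests received.",
    500: "Service is unavailable or encountered a problem.",
}

# Assumption: only throttling and server-side failures are transient.
RETRYABLE_STATUSES = frozenset({429, 500})


def describe_status(status_code: int) -> str:
    """Return a human-readable message for a Modify Turn response status."""
    if status_code == SUCCESS_STATUS:
        return "The correction was accepted."
    return MODIFY_TURN_ERRORS.get(status_code, f"Unexpected status code {status_code}.")


def is_retryable(status_code: int) -> bool:
    """Return True if this example's policy would retry the request."""
    return status_code in RETRYABLE_STATUSES


for code in (201, 400, 429):
    print(code, describe_status(code), is_retryable(code))
```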


Last updated: Nov 27, 2023