Run an Automatic Speech Recognition (ASR) Evaluation

The Automatic Speech Recognition (ASR) Evaluation tool allows you to batch test audio files to measure the ASR accuracy of the skills that you've developed. You can batch test sample audio utterances against ASR models and compare the expected transcriptions with the actual transcriptions. The tool generates an evaluation report with accuracy metrics and a pass/fail result for each test utterance, which you can use to resolve accuracy issues. You can also create regression test runs to automatically measure ASR accuracy at regular intervals or before skill updates or deployments.

This page describes how to run an ASR evaluation on an existing annotation set. If you have not already created an annotation set for ASR, see Create an Annotation Set for Automatic Speech Recognition (ASR).
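
The console performs the comparison between expected and actual transcriptions for you, but if it helps to picture what a batch run produces, the following Python sketch illustrates the general idea: each expected transcription is checked against the actual transcription and marked pass or fail. The normalization rules and data shapes here are illustrative assumptions, not the tool's actual implementation.

```python
# Conceptual sketch only: the ASR Evaluation tool performs this comparison in the
# developer console. The normalization rules and data structures below are
# illustrative assumptions, not the tool's actual logic.
import re

def normalize(text: str) -> str:
    """Lowercase and strip punctuation so minor formatting differences don't count as errors."""
    return re.sub(r"[^a-z0-9' ]", "", text.lower()).strip()

def evaluate(utterances):
    """Return a pass/fail result for each (expected, actual) transcription pair."""
    results = []
    for expected, actual in utterances:
        passed = normalize(expected) == normalize(actual)
        results.append({"expected": expected, "actual": actual, "passed": passed})
    return results

# Example with two hypothetical test utterances.
report = evaluate([
    ("play relaxing music", "play relaxing music"),
    ("open my daily planner", "open my daily plan"),
])
for row in report:
    print(row)
```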

Prerequisites

You'll need the following items to run an ASR evaluation:

  - An Amazon developer account with access to the Alexa developer console.
  - A skill built in the Alexa developer console.
  - An annotation set for ASR that contains your test audio files and expected transcriptions. If you have not yet created one, see Create an Annotation Set for Automatic Speech Recognition (ASR).

Run an ASR evaluation on an annotation set

To run an ASR evaluation on an annotation set

  1. Navigate to your annotation set:
    1. With your Amazon developer credentials, log in to the Alexa developer console.
    2. From the developer console, navigate to the Build tab.
    3. In the left navigation pane, under Custom, click Annotation Sets to display the NLU Evaluation tab, and then click the ASR Evaluation tab to display your list of existing annotation sets.
  2. In the top-right corner of the page, click the Evaluate Model button:

    Evaluate Model

    An ASR Evaluation window opens.

    Run Evaluation
  3. In the ASR Evaluation window, from the Annotation Source drop-down list, select the annotation set that you want to evaluate.

  4. Click the Run an Evaluation button.

    The ASR Evaluation tool runs the test. When the test is complete, the ASR Evaluation window displays the Evaluation Id for the run, with a link to the test report.

  5. To view the Evaluation Report for the run, click the hyperlink with the Evaluation Id.

    The Evaluation Report displays the Expected vs. Actual Transcription and the pass/fail result for each utterance, as well as the overall pass percentage for the annotation set. The overall pass percentage is a weighted average across all the utterances in the annotation set (a sketch of this calculation follows these steps):

    Evaluation Report
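
The console reports the overall pass percentage for you; the short sketch below only illustrates how a weighted pass percentage can be derived from per-utterance results. The per-utterance weight field is an assumption for illustration, not the report's exact schema.

```python
# Sketch of an overall pass percentage computed as a weighted average of
# per-utterance results. The 'weight' field is an illustrative assumption;
# the Evaluation Report computes this value for you.

def overall_pass_percentage(results):
    """results: list of dicts with a boolean 'passed' and a numeric 'weight'."""
    total_weight = sum(r["weight"] for r in results)
    if total_weight == 0:
        return 0.0
    passed_weight = sum(r["weight"] for r in results if r["passed"])
    return 100.0 * passed_weight / total_weight

print(overall_pass_percentage([
    {"passed": True, "weight": 1},
    {"passed": True, "weight": 2},
    {"passed": False, "weight": 1},
]))  # 75.0
```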

Export ASR results to the Natural Language Understanding (NLU) evaluation tool

Whereas the ASR Evaluation Tool measures speech recognition and transcription accuracy for Alexa, the NLU Evaluation Tool measures natural language understanding accuracy. After you have your transcriptions from the ASR Evaluation Tool, you can use the NLU evaluation tool to see how accurately those transcriptions map to intents and slots. Using these two tools together gives you an end-to-end view of the accuracy of your skill's interaction model.

To export ASR results to the NLU evaluation tool

  1. On the ASR Evaluation Report page, from the Select export location menu, choose the NLU annotation set that corresponds to the ASR results that you want to evaluate.

    Select NLU annotation set
  2. Click the Export and go button.

    The NLU evaluation tool appends the actual transcriptions of the ASR evaluation to a new or existing NLU annotation set and displays a "Success" notification.

    Annotation set exported

    Next, use the NLU evaluation tool to compare the actual transcriptions from the ASR evaluation to the skill model's expected mapping to intents and slots, as illustrated in the sketch that follows these steps. See Batch Test Your Natural Language Understanding (NLU) Model.
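
If it helps to picture what the exported data is used for, the sketch below pairs each actual transcription with the intent and slots you expect it to resolve to. The record layout (utterance, expectedIntent, expectedSlots) and the intent name are simplified, hypothetical assumptions, not the NLU annotation set's exact schema; the console handles the export itself.

```python
# Illustrative sketch only: the console performs the export. The field names and
# the intent name below are hypothetical, not the NLU annotation set's schema.

def to_nlu_annotations(asr_results, expected_mappings):
    """Pair each actual transcription with the intent and slots it is expected to resolve to.

    asr_results: list of dicts with 'expected' and 'actual' transcriptions (from the ASR report).
    expected_mappings: dict keyed by expected transcription -> {'intent': ..., 'slots': {...}}.
    """
    annotations = []
    for result in asr_results:
        mapping = expected_mappings.get(result["expected"], {})
        annotations.append({
            "utterance": result["actual"],          # what Alexa actually heard
            "expectedIntent": mapping.get("intent"),
            "expectedSlots": mapping.get("slots", {}),
        })
    return annotations

# Example: one hypothetical utterance whose actual transcription differs slightly.
annotations = to_nlu_annotations(
    [{"expected": "open my daily planner", "actual": "open my daily plan"}],
    {"open my daily planner": {"intent": "OpenPlannerIntent", "slots": {}}},
)
print(annotations)
```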

Next steps

If your Evaluation Report identifies areas where you would like to improve your skill's performance, see Improve your Automatic Speech Recognition (ASR) Test Results.