Run an Automatic Speech Recognition (ASR) Evaluation

The Automatic Speech Recognition (ASR) Evaluation tool lets you batch test audio files to measure the ASR accuracy of the skills that you've developed. With the tool, you can run your sample audio utterances against ASR models and compare the expected transcriptions with the actual transcriptions. The tool generates an evaluation report with accuracy metrics and a pass/fail result for each test utterance, which you can use to resolve accuracy issues. You can also create regression test runs to automatically measure ASR accuracy at regular intervals or before skill updates or deployments.
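
To make the idea of comparing expected and actual transcriptions concrete, here is a minimal Python sketch. It is not the tool's implementation: the normalization rule, the exact-match pass rule, and the function names (normalize, word_error_rate, passes) are all assumptions chosen for illustration. Word error rate is a common ASR accuracy metric, but this page does not state which metrics the report uses.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase, replace punctuation with spaces, and split into words.

    This normalization is an assumption, not the tool's documented behavior.
    """
    return re.sub(r"[^a-z0-9' ]+", " ", text.lower()).split()


def word_error_rate(expected: str, actual: str) -> float:
    """Word-level edit distance divided by the number of expected words."""
    ref, hyp = normalize(expected), normalize(actual)
    if not ref:
        return 0.0 if not hyp else 1.0
    # Dynamic-programming Levenshtein distance computed over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # expected word missing
                            curr[j - 1] + 1,      # extra word in actual
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def passes(expected: str, actual: str) -> bool:
    """Hypothetical pass rule: normalized transcriptions match exactly."""
    return normalize(expected) == normalize(actual)
```

For example, word_error_rate("play some jazz music", "play sum jazz music") is 0.25, because one of the four expected words was substituted, and passes returns False for that pair.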

This page describes how to run an ASR evaluation on an existing annotation set. If you have not already created an annotation set for ASR, see Create an Annotation Set for Automatic Speech Recognition (ASR).


You'll need the following items to run an ASR evaluation:

Run an ASR evaluation on an annotation set

To run an ASR evaluation on an annotation set:

  1. Navigate to your annotation set:
    1. With your Amazon developer credentials, log in to the Alexa developer console.
    2. From the developer console, navigate to the Build tab.
    3. In the left navigation, under Custom, click Annotation Sets to display the NLU Evaluation tab, and then click the ASR Evaluation tab to display your list of existing annotation sets.
  2. In the top-right corner of the page, click the Evaluate Model button:

    Evaluate Model

    An ASR Evaluation window opens.

    Run Evaluation
  3. On the ASR Evaluation window, from the Annotation Source drop-down list, select the annotation set that you want to evaluate.

  4. Click the Run an Evaluation button.

    The ASR Evaluation tool runs its test. When the test completes, the ASR Evaluation window displays the Evaluation Id for the run, with a link to the test report.

  5. To view the Evaluation Report for the run, click the hyperlink with the Evaluation Id.

    The Evaluation Report displays the Expected vs. Actual Transcription, a pass/fail result for each utterance, and the overall pass percentage for the annotation set. The overall pass percentage is a weighted average across all the utterances in the annotation set:

    Evaluation Report
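
As a rough illustration of how a weighted overall pass percentage can be computed from per-utterance results, consider the Python sketch below. This page does not say how the tool weights utterances, so the weight field, the UtteranceResult class, and the overall_pass_percentage function are hypothetical names and assumptions, not part of the tool.

```python
from dataclasses import dataclass


@dataclass
class UtteranceResult:
    """One row of a hypothetical evaluation report."""
    expected: str
    actual: str
    passed: bool
    weight: float = 1.0  # assumed per-utterance weight; not documented here


def overall_pass_percentage(results: list[UtteranceResult]) -> float:
    """Weighted share of passing utterances, expressed as a percentage."""
    total_weight = sum(r.weight for r in results)
    if total_weight == 0:
        return 0.0  # nothing to evaluate
    passed_weight = sum(r.weight for r in results if r.passed)
    return 100.0 * passed_weight / total_weight


report = [
    UtteranceResult("play jazz", "play jazz", passed=True, weight=2.0),
    UtteranceResult("stop the music", "stop the muse", passed=False),
    UtteranceResult("next song", "next song", passed=True),
]
overall = overall_pass_percentage(report)
```

With equal weights this reduces to the plain fraction of utterances that pass. In the sample above, the first utterance counts double, so passing weight 3.0 out of a total of 4.0 gives 75.0.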

If your Evaluation Report identifies areas where you would like to improve your skill's performance, see Improve your Automatic Speech Recognition (ASR) Test Results.