Run an Automatic Speech Recognition (ASR) Evaluation
The Automatic Speech Recognition (ASR) Evaluation tool allows you to batch test audio files to measure the ASR accuracy of the skills that you've developed. With the tool, you can run your sample audio utterances against ASR models and compare the expected transcriptions with the actual transcriptions. The tool generates an evaluation report with accuracy metrics and a pass/fail result for each test utterance, which you can use to resolve accuracy issues. You can also create regression test runs to automatically measure ASR accuracy at regular intervals or before skill updates or deployments.
This page describes how to run an ASR evaluation on an existing annotation set. If you have not already created an annotation set for ASR, see Create an Annotation Set for Automatic Speech Recognition (ASR).
You'll need the following items to run an ASR evaluation:
- An Amazon developer account. If you don't already have one, go to developer.amazon.com to create an account.
- An existing annotation set of audio utterances. See Create an Annotation Set for Automatic Speech Recognition (ASR) for instructions on creating an annotation set.
Run an ASR evaluation on an annotation set
To run an ASR evaluation on an annotation set
- Navigate to your annotation set:
- With your Amazon developer credentials, log in to the Alexa developer console.
- From the developer console, navigate to the Build tab.
- In the left navigation pane, under the Custom tab, click Annotation Sets to display the NLU Evaluation tab, then click the ASR Evaluation tab to display your list of existing annotation sets.
- In the top-right corner of the page, click the Evaluate Model button.
An ASR Evaluation window opens.
- On the ASR Evaluation window, from the Annotation Source drop-down list, select the annotation set that you want to evaluate.
- Click the Run an Evaluation button.
The ASR Evaluation tool runs its test. When the test has completed, the ASR Evaluation window displays the Evaluation Id for the run with a link to the test report.
- To view the Evaluation Report for the run, click the hyperlink with the Evaluation Id.
The Evaluation Report displays the expected vs. actual transcription and the pass/fail result for each utterance, as well as the overall pass percentage for the annotation set. The overall pass percentage is a weighted average across all the utterances in the annotation set.
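To make the report's numbers concrete, the sketch below approximates the per-utterance pass/fail check and the overall pass percentage. This is an illustration, not the console's actual implementation: the normalization rules (lowercasing, stripping punctuation) and the equal weighting of utterances are assumptions.

```python
import re

def normalize(text):
    # Lowercase and strip punctuation so trivial formatting differences
    # between expected and actual transcriptions don't fail an utterance.
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).split()

def evaluate(results):
    """results: list of (expected_transcription, actual_transcription) pairs."""
    passed = [normalize(exp) == normalize(act) for exp, act in results]
    # Overall pass percentage: here each utterance is weighted equally.
    pass_pct = 100.0 * sum(passed) / len(passed)
    return passed, pass_pct

results = [
    ("play jazz music", "play jazz music."),  # passes after normalization
    ("open my skill", "open my skills"),      # fails: transcription differs
]
passed, pct = evaluate(results)
print(passed, pct)  # [True, False] 50.0
```

A failing utterance in this comparison corresponds to a row in the report where the actual transcription diverges from the expected one.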
Export ASR results to the Natural Language Understanding (NLU) evaluation tool
Whereas the ASR Evaluation tool measures speech recognition and transcription accuracy for Alexa, the NLU Evaluation tool measures natural language understanding accuracy. After you have your transcriptions from the ASR Evaluation tool, you can use the NLU evaluation tool to see how accurately those transcriptions map to intents and slots. Using these two tools together gives you a holistic, end-to-end view of the accuracy of your skill's interaction model.
To export ASR results to the NLU evaluation tool
- On the ASR Evaluation Report page, from the Select export location menu, choose the NLU annotation set that corresponds to the ASR results that you want to evaluate.
- Click the Export and go button.
The NLU evaluation tool appends the actual transcriptions of the ASR evaluation to a new or existing NLU annotation set and displays a "Success" notification.
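Conceptually, the export step takes each evaluated utterance's actual transcription and appends it as a new entry in the NLU annotation set. The sketch below illustrates that transformation; the field names and record shapes are hypothetical, not the console's actual schema.

```python
def export_to_nlu(asr_results, nlu_annotation_set):
    """Append each ASR result's actual transcription to an NLU annotation set.

    asr_results: list of dicts with an 'actual_transcription' key (assumed shape).
    nlu_annotation_set: list of NLU annotations, each pairing an utterance with
    the intent mapping you expect the NLU model to resolve it to.
    """
    for result in asr_results:
        nlu_annotation_set.append({
            "utterance": result["actual_transcription"],
            # Filled in later when you annotate the expected intent/slots
            # in the NLU evaluation tool.
            "expected_intent": None,
        })
    return nlu_annotation_set

nlu_set = export_to_nlu(
    [{"actual_transcription": "play jazz music"}],
    [],  # a new, empty NLU annotation set
)
print(len(nlu_set))  # 1
```

The point of the export is that the NLU evaluation then tests what the ASR model actually heard, rather than an idealized typed utterance.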
Next, use the NLU evaluation tool to compare the actual transcriptions from the ASR evaluation to the skill model's expected mapping to intents and slots. See Batch Test Your Natural Language Understanding (NLU) Model.
If your Evaluation Report identifies areas where you would like to improve your skill's performance, see Improve your Automatic Speech Recognition (ASR) Test Results.