About the Automatic Speech Recognition (ASR) Evaluation tool
The Automatic Speech Recognition (ASR) Evaluation tool lets you batch test audio files to measure the ASR accuracy of the skills that you've developed. With the tool, you can run your sample audio utterances against ASR models and compare the expected transcriptions with the actual transcriptions. The tool generates an evaluation report with accuracy metrics and a pass/fail result for each test utterance, which you can use to resolve accuracy issues.
To learn more about ASR, see What Is Automatic Speech Recognition?.
This page introduces the ASR Evaluation tool in the Alexa developer console and the ASR APIs that you can use if you prefer to run ASR evaluations programmatically.
Benefits of ASR Evaluation
If the users of your skill aren't getting their expected responses from Alexa, ASR Evaluation can help you troubleshoot speech recognition issues and improve skill performance. ASR Evaluation can help pinpoint commonly misrecognized words for your skill. You can then potentially improve recognition accuracy for those words by mapping them back to the skill model as sample utterances and slot values.
For example, if you have a coffee-related skill where you expect users to ask Alexa to "order a mocha," ASR evaluation results might show you that sometimes Alexa misunderstands the word "mocha" as "milk." To mitigate this issue, you can map an utterance directly to an Alexa intent to help improve Alexa's understanding within your skill.
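To illustrate the kind of comparison behind this example, the following minimal Python sketch tallies word-level substitutions between expected and actual transcriptions. It is not the ASR Evaluation tool's own algorithm: the function name `count_substitutions` and the sample data are hypothetical, and the word alignment relies on `difflib.SequenceMatcher` from the Python standard library.

```python
from collections import Counter
from difflib import SequenceMatcher


def count_substitutions(pairs):
    """Tally (expected_word, actual_word) substitutions across transcription pairs.

    `pairs` is an iterable of (expected, actual) transcription strings.
    Hypothetical helper for illustration; not part of any Alexa API.
    """
    tally = Counter()
    for expected, actual in pairs:
        exp_words = expected.lower().split()
        act_words = actual.lower().split()
        matcher = SequenceMatcher(a=exp_words, b=act_words, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Only count one-for-one word replacements; insertions and
            # deletions are ignored in this simplified sketch.
            if tag == "replace" and (i2 - i1) == (j2 - j1):
                tally.update(zip(exp_words[i1:i2], act_words[j1:j2]))
    return tally


# Made-up sample results for a coffee-ordering skill.
results = [
    ("order a mocha", "order a milk"),
    ("order a mocha", "order a mocha"),
    ("get me a mocha", "get me a milk"),
]
substitutions = count_substitutions(results)
print(substitutions.most_common(1))
```

With this sample data, the most frequent pair is "mocha" heard as "milk", which is the kind of word you might then add to the skill model as a sample utterance or slot value.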
Overview of the ASR Evaluation process
Use the following process to run the ASR Evaluation tool and improve the speech recognition accuracy and interaction model for your skill:
- Create an annotation set of recorded utterances to use for testing.
- Run the ASR Evaluation tool.
- Use the results from an ASR evaluation to improve your skill's accuracy and interaction model.
ASR APIs
If you prefer to create your annotation sets and run the ASR Evaluation programmatically instead of using the developer console, Amazon also provides a set of APIs for these tasks.
Available ASR APIs
The following ASR APIs are available:
- Create annotation set API – Call this API to create an empty annotation set. Fill the annotation set with pre-recorded utterances by calling the Update annotation set annotations API.
- Delete annotation set API – Call this API to delete a specified annotation set.
- Delete ASR evaluation API – Call this API to delete the specified ASR evaluation, including in-progress evaluations.
- Get annotation set contents API – Call this API to download the annotation set contents in text/csv or application/json format.
- Get annotation set metadata API – Call this API to return the metadata for the specified annotation set.
- Get ASR evaluation results API – Call this API to return detailed ASR evaluation results.
- Get ASR evaluation status API – Call this API to return high-level information about a specified ASR evaluation run.
- List all annotation sets API – Call this API to list all annotation sets for a given skill.
- List ASR evaluations API – Call this API to return historical Automatic Speech Recognition (ASR) evaluations.
- Post ASR evaluation API – Call this API to run ASR evaluations against an existing annotation set.
- Update annotation set annotations API – Call this API to update the annotations included for an existing annotation set.
- Update annotation set property API – Call this API to update the name of an existing annotation set.
API call flow
The following process describes the order in which to call the APIs to run an ASR evaluation:
- Create your audio catalog:
  - Call Create a catalog to create your new catalog.
  - Call Associate a catalog to a skill to associate your new catalog with the skill that you're evaluating.
  - Create the upload for your catalog by calling Catalog content upload.
  - Upload your .zip file of audio files to the S3 URL returned by Catalog content upload. Audio files must be in .mp3, .wav, .aiff, or .ogg format.
  - After the upload completes, call Complete Upload.
  - Call Get Upload to get the ingestion status of your upload.
- Create your annotation set:
  - Call the Create annotation set API to create your empty annotation set.
  - Call the Update annotation set annotations API to add your uploaded audio utterances to the annotation set.
- Run the ASR evaluation by calling the Post ASR evaluation API.
- Retrieve your ASR evaluation status and results by calling the Get ASR evaluation status API and the Get ASR evaluation results API.
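The call order above can be sketched in Python as follows. This is only an outline under stated assumptions: `client` is a hypothetical wrapper object, every method name, argument, and return shape on it is invented for illustration, and the `"IN_PROGRESS"` status value is assumed. Consult the individual API references for the real requests and responses.

```python
import time
from pathlib import PurePosixPath

# Audio formats listed in the upload step above.
ALLOWED_AUDIO_SUFFIXES = {".mp3", ".wav", ".aiff", ".ogg"}


def unsupported_audio_files(filenames):
    """Return the file names whose extension is not an accepted audio format."""
    return [
        name for name in filenames
        if PurePosixPath(name).suffix.lower() not in ALLOWED_AUDIO_SUFFIXES
    ]


def run_asr_evaluation(client, skill_id, audio_zip, annotations,
                       poll_seconds=5.0, max_polls=60):
    """Walk the documented call order using a hypothetical `client` object.

    Every `client` method name, argument, and return shape here is an
    assumption made for illustration; map each one to the real API.
    """
    # 1. Create the audio catalog and upload the .zip of audio files.
    catalog_id = client.create_catalog()
    client.associate_catalog_with_skill(catalog_id, skill_id)
    upload_id, s3_url = client.create_catalog_upload(catalog_id)
    client.upload_to_s3(s3_url, audio_zip)
    client.complete_upload(catalog_id, upload_id)
    client.get_upload(catalog_id, upload_id)  # check the ingestion status

    # 2. Create the annotation set and fill it with the uploaded utterances.
    annotation_set_id = client.create_annotation_set(skill_id)
    client.update_annotation_set_annotations(
        skill_id, annotation_set_id, annotations)

    # 3. Run the evaluation against the annotation set.
    evaluation_id = client.post_asr_evaluation(skill_id, annotation_set_id)

    # 4. Poll the status, then fetch the detailed results.
    for _ in range(max_polls):
        status = client.get_asr_evaluation_status(skill_id, evaluation_id)
        if status != "IN_PROGRESS":  # assumed status value
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("ASR evaluation did not finish in time")
    return client.get_asr_evaluation_results(skill_id, evaluation_id)
```

Running `unsupported_audio_files` on the file names before you build the .zip file is a cheap way to catch formats that the upload step does not accept.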
ASR API error codes
For a reference of the errors that can apply to all ASR APIs, see Automatic Speech Recognition (ASR) API Error Reference.
Related topics
- What is Automatic Speech Recognition?
- Create an Annotation Set for Automatic Speech Recognition (ASR)
- Run an Automatic Speech Recognition (ASR) Evaluation
- Improve your Automatic Speech Recognition (ASR) Test Results
- Create annotation set API
- Update annotation set annotations API
- Post ASR evaluation API
- Get ASR evaluation results API
- Create catalog API
- Batch Test Your Natural Language Understanding (NLU) Model