Test Wake Word Detection Delay (WWDD)
"Wake Word Detection Delay" (WWDD) is the minimum time that a particular device requires a user to pause between saying "Alexa" and the request in order for the Alexa Voice Service to reliably receive the entire request. For on-device clients with microphones (Alexa Built-in devices), Amazon requires that the WWDD is no more than one second. Ideally, the device audio input is implemented in such a way that the WWDD is zero.
- Test room configuration
- Test files
- Test steps
Consider this example: "Alexa, cat food to my shopping list."
The device must hear all of the request: "cat food to my shopping list." If the device hears any less than that (for example, "food to my shopping list" or "to my shopping list"), AVS will not provide an accurate response for the user's intent. If the device requires a significant delay to capture the whole request accurately, saying the wake word will be a poor experience for the customer.
For more guidance, see Enable Cloud-Based Wake Word Verification and Requirements for Cloud-Based Wake Word Verification.
Test room configuration
For WWDD testing, place the Speech Speaker in only one location: 0.9 m from the device under test (DUT) along the 90-degree path. To confirm that the DUT is able to respond, say something like "Alexa, what time is it?"
For full instructions, learn how to Set up your Test Environment.
Test files
To test a device for WWDD, Amazon provides several utterance audio files with the same request, but with increasing amounts of delay between "Alexa" and the start of the request. For example: "Alexa, (increasing delay here) cat food to my shopping list."
During testing, you'll open an audio file with a particular delay built into the utterance and play this audio file 10 times. Each time, use the scoresheet to characterize the waking and the response of the device.
How to use the utterance files
As you play the audio files, you'll discover which of these files is the first to consistently yield the expected response. The filename of the audio file indicates the time delay, and thus you know the minimum time delay, the WWDD, for the device being tested.
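As a rough illustration of reading the delay from a filename, the sketch below assumes a hypothetical naming scheme in which the first decimal number in the name is the delay in seconds. The example filenames are invented for illustration; check the actual names in the WWDD_utterances folder before relying on this pattern.

```python
import re
from typing import Optional

# Assumption: the first decimal number in the filename is the delay in seconds.
# The real naming scheme of the WWDD utterance files may differ.
_DELAY_RE = re.compile(r"(\d+\.\d+)")


def delay_from_filename(filename: str) -> Optional[float]:
    """Return the delay encoded in the filename, or None if none is found."""
    match = _DELAY_RE.search(filename)
    return float(match.group(1)) if match else None


# Hypothetical filenames, sorted from shortest to longest delay.
files = [
    "alexa_cat_food_0.10.wav",
    "alexa_cat_food_0.00.wav",
    "alexa_cat_food_0.05.wav",
]
ordered = sorted(files, key=delay_from_filename)
```

Sorting by the parsed delay gives the order in which the files are played during the test, starting with the shortest delay.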
The following utterances are provided for testing in each locale:
|Locale|Utterance|Expected response|Domain|
|---|---|---|---|
|en_US, en_CA, en_UK, en_IN, en_AU|"Alexa, cat food to my shopping list"|"I've added cat food to your shopping list"|Shopping|
|fr_FR|"Alexa, trois mille moins cinq"|"deux mille neuf cent quatre-vingt-quinze"|Calculator|
|fr_CA|"Alexa, trois mille moins cinq"|"deux mille neuf cent quatre-vingt-quinze"|Calculator|
|de_DE|"Alexa, dreihunderteins plus eins"|"dreihunderteins plus eins ist dreihundertzwei"|Calculator|
|it_IT|"Alexa, tremila più uno"|"tremila e uno"|Calculator|
|es_ES|"Alexa, tres mil más uno"|"tres mil más uno es tres mil uno"|Calculator|
|es_MX|"Alexa, tres mil más uno"|"tres mil más uno es tres mil uno"|Calculator|
How to use the scoresheet
Open the "WWDD" tab of the scoresheet. The row headings on the left are varying amounts of delay. These numbers appear in the file names found in the WWDD_utterances folder on the Speech Laptop (defined in the "Equipment" section).
Here is how you characterize the waking and response:
- 1: The response matches what is expected (for example, "I've added cat food to your shopping list").
- p: The response shows that the start of the request was missed, but the domain is correct (for example, "I've added food to your shopping list").
- w: The device woke up, but it had no response. Or, the response was completely wrong and not in the expected domain.
- 0: The device did not wake up.
In the following example, for the audio file with the "0.00" delay in the utterance, the DUT frequently missed the start of the request and returned an incorrect response within the correct domain. The DUT first achieved all 1s for the 0.05-second delay. To support that result, testing was repeated with the 0.10-second delay. Therefore, the official WWDD value for this device is 0.05 seconds.
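The way a WWDD value is read off a completed scoresheet can be sketched in a few lines. This is only an illustration: the dictionary layout, the `find_wwdd` name, and the sample scores are assumptions, and the sketch requires that the next longer delay also scores all 1s before it accepts a row.

```python
from typing import Dict, List, Optional


def is_clean(row: List[str], plays: int = 10) -> bool:
    """True when the row holds the expected number of plays, all scored '1'."""
    return len(row) == plays and all(score == "1" for score in row)


def find_wwdd(sheet: Dict[float, List[str]], plays: int = 10) -> Optional[float]:
    """Return the shortest delay whose row is all 1s and is confirmed by
    an all-1s row at the next longer delay; None if no such pair exists."""
    delays = sorted(sheet)
    for current, following in zip(delays, delays[1:]):
        if is_clean(sheet[current], plays) and is_clean(sheet[following], plays):
            return current
    return None


# Illustrative scores only (not real measurements): delay in seconds -> 10 plays.
example_sheet = {
    0.00: ["p", "p", "1", "p", "p", "1", "p", "p", "1", "p"],
    0.05: ["1"] * 10,
    0.10: ["1"] * 10,
}
wwdd = find_wwdd(example_sheet)
```

With these sample scores, `wwdd` is 0.05, matching the example described above.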
Test steps
In these steps, assess the DUT for WWDD by playing each audio file 10 times, then annotating the results in the scoresheet provided.
Prepare the test files
To play the WWDD utterance files, open the WWDD_utterances folder on the Speech Laptop. To ensure the device responds to an audio file, repeat the following steps as needed:
- Load and play the audio file that has 0.2 seconds of delay.
- Adjust the volume so that an SPL reading at the microphone array of the DUT is roughly 75 dBC. (The intended result is that the device has no difficulty hearing the request under ideal conditions.)
- Open the scoresheet.
- In the "Information" tab of the scoresheet, enter a name for your device in place of "Name DUT".
- Select the "WWDD" tab. (The name you typed in the "Information" tab now appears in several other places throughout the scoresheet, such as the "WWDD" tab.)
- In the "Information" tab, fill in all remaining fields (except for "Wake Word Detection Delay"). To find the "Serial Number," you can use either of these approaches:
  - Log in to the Amazon.com account associated with the DUT. Visit Manage Your Account and Devices, then Your Devices, then Actions. There, you should find the serial number.
  - In the Alexa app account that the device is registered to, visit Settings, then Devices, and select the DUT from the list. At the bottom of the page, you'll find the serial number.
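For the volume adjustment step above, it can help to estimate how far the playback level is from the 75 dBC target. The sketch below uses the standard decibel relationship between a level difference and a linear amplitude ratio. The function names and the 69 dBC reading are illustrative assumptions, and a real playback chain may not respond linearly, so re-measure with the SPL meter after any change.

```python
TARGET_SPL_DBC = 75.0  # target level at the DUT microphone array, from the steps above


def required_gain_db(measured_spl_dbc: float, target_spl_dbc: float = TARGET_SPL_DBC) -> float:
    """Level change in dB needed to move from the measured SPL to the target."""
    return target_spl_dbc - measured_spl_dbc


def amplitude_factor(gain_db: float) -> float:
    """Linear amplitude ratio that corresponds to a gain expressed in dB."""
    return 10 ** (gain_db / 20)


# Hypothetical reading: the meter shows 69 dBC, so about 6 dB more is needed.
gain = required_gain_db(69.0)
factor = amplitude_factor(gain)
```

A 6 dB increase corresponds to roughly doubling the signal amplitude, which gives a starting point for the volume control before the next SPL reading.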
Test for each audio file
- Load and play the WWDD utterance audio file with the shortest delay (0.00 seconds).
- In the scoresheet, characterize the response (1, p, w, 0).
- For this same audio file, repeat this process (playing the file and characterizing the response) 9 more times.
- Using the next (longer delay) audio file, repeat these steps until your scoresheet contains a row with all "1s."
- Repeat these steps with one more audio file. This test is complete when you have two rows with all "1s" in your scoresheet.
- In the "Information" tab of the scoresheet, in the WWDD row, note the delay associated with the first row of all "1s."
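The stopping rule in the steps above can be sketched as a small loop. This is a minimal sketch, not a test tool: `play_and_score` is a hypothetical callback standing in for the manual step of playing a file and entering 1, p, w, or 0, and the choice to restart the search when the confirming row is not all 1s is an assumption, because the steps do not cover that case.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

VALID_SCORES = {"1", "p", "w", "0"}


def run_wwdd_test(
    delays: Sequence[float],
    play_and_score: Callable[[float], str],
    plays: int = 10,
) -> Tuple[Optional[float], Dict[float, List[str]]]:
    """Play each delay `plays` times, shortest first, and stop once an
    all-1s row has been confirmed by an all-1s row at the next delay."""
    sheet: Dict[float, List[str]] = {}
    first_clean: Optional[float] = None
    for delay in sorted(delays):
        row = [play_and_score(delay) for _ in range(plays)]
        unknown = set(row) - VALID_SCORES
        if unknown:
            raise ValueError(f"unexpected score(s): {sorted(unknown)}")
        sheet[delay] = row
        clean = all(score == "1" for score in row)
        if first_clean is not None and clean:
            return first_clean, sheet  # two all-1s rows: test complete
        # Assumption: if the confirming row fails, the search starts over.
        first_clean = delay if clean else None
    return None, sheet


# Simulated DUT for illustration: misses the start of the request below 0.05 s.
def fake_dut(delay: float) -> str:
    return "1" if delay >= 0.05 else "p"


wwdd, sheet = run_wwdd_test([0.00, 0.05, 0.10, 0.15, 0.20], fake_dut)
```

With the simulated device, the loop stops after the 0.10-second file and reports 0.05 seconds, so the longer-delay files are never played.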
View the results
For your device to pass, the recorded wake word detection delay must be no more than 1 second. Ideally, the device audio input is implemented in such a way that the WWDD is effectively zero.
|Test|Passing criterion|
|---|---|
|WWDD - Wake Word Detection Delay|No more than 1 second|
See Amazon's passing criteria for all acoustic tests.