In the vehicle, customers can invoke Alexa by saying the wake word or pressing a button to begin speech dialogue. To communicate that she is listening, Alexa uses sounds (earcons) and visuals (voice chrome). Voice chrome also indicates when Alexa is thinking and speaking.
There are two primary ways to invoke Alexa in the vehicle. Both are required.
- Saying the wake word “Alexa”, also known as Alexa Hands-Free.
- Pressing the Push-to-talk (PTT) button to invoke Alexa directly without the wake word.
(Required) Support Alexa Hands-Free (wake word) to invoke Alexa.
The wake word provides hands-free, voice-forward experiences with Alexa. Minimizing the need for drivers to view or touch the screen helps reduce visual (eyes off the road) and manual (hands off the wheel) distractions in the car. Customers can turn off Alexa Hands-Free (wake word) in settings (see Menu and settings for details). Customers must first enable Alexa before they can begin speaking with her.
(Required) Enable invocation of Alexa only after the customer has completed Alexa setup.
To ensure customer privacy, don’t enable wake word or dialogue with Alexa until the customer has enabled Alexa in setup. See Setup for details.
(Required) Provide customers a way to disable Alexa Hands-Free under the Alexa menu.
Customers may choose to disable the Alexa wake word. However, they can still access Alexa using a button press (PTT). See Menu and settings for details.
(Required) Indicate to the customer when Alexa Hands-Free is disabled.
To indicate that Alexa Hands-Free (AHF) is off:
- Show the red Alexa voice chrome for 5 seconds when the customer turns off AHF.
- Display the AHF Off icon in the status bar of your IVI screen.
PTT is another way customers can invoke Alexa. If Alexa Hands-Free has been turned off, customers should still be able to use PTT to speak to Alexa in their vehicle.
(Required) If the vehicle offers a PTT button, customers must be able to invoke Alexa via PTT without a wake word.
If a customer assigns Alexa as the default voice assistant, use a short press on the PTT button to invoke Alexa without the wake word. Alexa should be invoked immediately (within 250 ms) after the PTT button has been pressed and released.
If Alexa is not set as the default assistant for PTT, we recommend still allowing the customer to say “Alexa” after pressing the PTT button as another way to speak with Alexa.
A customer can use the Tap-to-talk button to invoke Alexa immediately with one tap and without needing to say “Alexa”. The Tap-to-talk button should behave similarly to the Push-to-talk button.
(Required) If PTT is not available, provide Tap-to-talk (TTT) via an on-screen button to invoke Alexa.
(Required) Place the TTT button persistently in a single location with a consistent style.
The Tap-to-talk button uses the Alexa logo inside a circle. We recommend hex color #05A0D1 for the circle fill. Other acceptable options are any of the Alexa brand colors.
(Required) Allow Alexa to be invoked while mobile projection applications (e.g. Android Auto or Apple CarPlay) are running or other assistants are present.
(Required) Allow customers to interrupt Alexa when she is speaking (barge-in).
Customers must be able to interrupt Alexa with all available invocation methods. When interrupted, Alexa stops speaking and starts listening. For example, when Alexa is speaking about the weather, the customer can barge in with the wake word, PTT, or TTT and say “Will it rain tomorrow?”
(Required) Allow customers to cancel listening and speaking.
Customers can stop Alexa from listening by saying “cancel” or by pressing the PTT or TTT button during the listening state. When Alexa is speaking and the customer closes a display card, Alexa’s speech should stop. See the table below for more interruption behaviors.
This table shows how interruptions are to be implemented for interactions where Alexa is already listening or speaking.
|Customer action|Idle|Listening|Thinking|Speaking|
|---|---|---|---|---|
|Wake word|Start listening|No change|Barge-in|Barge-in|
|PTT press|Start listening|Cancel listening|Barge-in|Barge-in|
|TTT press|Start listening|Cancel listening|Barge-in|Barge-in|
|Presses a cancel, back or close button|-|Cancel listening|Cancel dialog|Cancel dialog|
|Dismisses the display card|-|-|Cancel dialog|Cancel dialog|
|Touches the screen (e.g. to scroll text or launch an app)|-|Listening continues|Thinking continues|Speech continues|
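The interruption behaviors in the table can be expressed as a simple lookup. The following is a minimal sketch, assuming the four value columns correspond to the Idle, Listening, Thinking, and Speaking states in that order; the key names and the `resolve` function are hypothetical and not part of any Alexa SDK.

```python
from typing import Optional

# Assumed column order of the interruption table.
STATES = ("idle", "listening", "thinking", "speaking")

# One row per customer action; None stands for "-" (not applicable).
INTERRUPTIONS = {
    "wake_word": ("start_listening", "no_change", "barge_in", "barge_in"),
    "ptt_press": ("start_listening", "cancel_listening", "barge_in", "barge_in"),
    "ttt_press": ("start_listening", "cancel_listening", "barge_in", "barge_in"),
    "cancel_back_or_close": (None, "cancel_listening", "cancel_dialog", "cancel_dialog"),
    "dismiss_display_card": (None, None, "cancel_dialog", "cancel_dialog"),
    "touch_screen": (None, "continue", "continue", "continue"),
}


def resolve(action: str, state: str) -> Optional[str]:
    """Look up what should happen when `action` occurs in attention `state`."""
    return INTERRUPTIONS[action][STATES.index(state)]


print(resolve("ptt_press", "listening"))  # cancel_listening
print(resolve("wake_word", "speaking"))   # barge_in
```

Keeping the behaviors in one table-shaped structure makes it easy to compare an implementation against the guideline row by row.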
The Alexa attention system
Alexa is a single personality that is coherent and familiar to customers across many devices. While the physical devices might be different, the attention system ensures Alexa behaves predictably and with familiarity. This consistency creates customer trust and strengthens the customer’s understanding of Alexa.
Alexa’s attention system comprises non-verbal audio and visual components that work together to communicate all of Alexa’s different states to the customer. Color, sound, and animation are critical for effectively communicating Alexa's state. Audio and visual cues must be synced so that Alexa’s state change indicators occur simultaneously as the customer wakes, speaks to, and listens to Alexa.
Alexa earcons are sound cues that play at the beginning and end of speech input. They help inform the customer when Alexa is listening. Amazon provides an Alexa Sound Library, which can be downloaded from the Alexa Auto Design toolkit. Note that there are distinct sounds to use for touch vs. wake word start of listening (see Examples below for details). Only sounds that must be stored in your vehicle’s head unit are provided. All other sounds are part of the Alexa response.
(Required) Play the Alexa earcons at the beginning and end of speech input.
This lets the customer know when Alexa is listening without looking at the screen. To ensure a good customer experience, play the earcons and display the voice chrome as quickly as possible after invocation. Earcons must stay in sync with voice chrome, playing within 100 ms of when voice chrome displays the corresponding beginning or end of the listening state. Customers can turn off earcons in Alexa settings. See Menu and settings for more info and examples.
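The 100 ms sync requirement lends itself to an automated check. The following is a minimal sketch, assuming the head unit can log a timestamp for each voice chrome change and each earcon playback; it treats the requirement as symmetric (the earcon may lead or lag by up to 100 ms), which is an interpretation. All names are hypothetical.

```python
from typing import List, Tuple

EARCON_SYNC_TOLERANCE_MS = 100  # from the guideline: earcon within 100 ms of voice chrome


def earcon_in_sync(chrome_at_ms: float, earcon_at_ms: float) -> bool:
    """True if the earcon played within the tolerance of the voice chrome change."""
    return abs(earcon_at_ms - chrome_at_ms) <= EARCON_SYNC_TOLERANCE_MS


def out_of_sync(events: List[Tuple[str, float, float]]) -> List[str]:
    """Return names of logged events (name, chrome_ms, earcon_ms) that violate the tolerance."""
    return [name for name, chrome, earcon in events if not earcon_in_sync(chrome, earcon)]


log = [("start_listening", 0, 40), ("end_listening", 2000, 2150)]
print(out_of_sync(log))  # ['end_listening']
```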
(Required) Use Alexa earcons only for Alexa features.
Don’t use Alexa's sound cues for any other interactions, including other speech systems or voice assistants.
(Required) Display the Alexa voice chrome when the customer invokes Alexa.
Voice chrome is a visual indicator of Alexa’s attention system and is displayed whenever the customer interacts with Alexa by voice. Use linear voice chrome, as it works best with Alexa’s Display Cards and does not obscure other on-screen content.
Voice chrome should reflect that Alexa is seamlessly integrated into the vehicle’s IVI and is not limited to a single app. Place voice chrome along the bottom edge of the screen as an overlay that does not cover the entire display. This provides a less jarring experience when invoking Alexa, and makes for a more seamless integration with the vehicle.
- Place linear voice chrome along an edge of the screen, preferably at the bottom.
- Don’t use a full-screen overlay or popup with voice chrome.
- Overlay voice chrome on any current IVI screen, e.g. Navigation.
(Required) Use only Alexa brand graphics to indicate that Alexa is listening.
Except for physical PTT buttons (e.g. on the steering wheel), don't use additional icons to invoke or represent Alexa. Use only Alexa icons and voice chrome to represent Alexa.
Attention system states
Attention states address the personality of Alexa at a high level across all domains. The core Alexa states are: Idle, Listening, Thinking, and Speaking. For products with visual cues, these states must be distinguishable from each other.
The Idle state can be considered Alexa’s default state. No visual voice chrome elements are displayed in this state, in contrast with all other states. This communicates Alexa is passively waiting for a request and not actively communicating.
The Listening state starts when Alexa has been invoked via wake word, PTT or TTT and the microphone begins streaming the customer’s request to the Alexa Voice Service. There are 3 stages to the Listening state:
- Start Listening - Alexa transitions from Idle to the Listening state and waits for a request from the customer.
- Active Listening - When the customer begins speaking, Alexa transitions into an Active Listening state. If she doesn’t hear anything from the customer, Alexa returns to the Idle state.
- End Listening - When the customer's end of speech is identified, Alexa transitions out of Listening state.
Note: In multi-turn interactions, the start of listening earcon plays only on the first turn of the interaction. In subsequent turns, the start of listening sound is not played. However, the end of listening earcon should play at the end of listening each time.
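The multi-turn earcon rule can be sketched as a small function. This is an illustration only: the earcon identifiers are hypothetical, and it assumes that PTT and TTT both count as touch-initiated invocation and that the customer's end of speech is detected on each turn.

```python
from typing import List


def earcons_for_turn(invocation: str, turn_index: int) -> List[str]:
    """Earcons to play for one listening turn (turn_index 0 is the first turn)."""
    earcons = []
    if turn_index == 0:
        # Distinct start sounds for wake word vs. touch invocation (assumed: PTT and TTT are touch).
        earcons.append("start_wakeword" if invocation == "wake_word" else "start_touch")
    # The end of listening earcon plays on every turn.
    earcons.append("end_listening")
    return earcons


print(earcons_for_turn("wake_word", 0))  # ['start_wakeword', 'end_listening']
print(earcons_for_turn("wake_word", 1))  # ['end_listening']
```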
When a customer completes a request, Alexa enters the Thinking state. This state lets the customer know the microphone is no longer active and Alexa is processing their request.
The Speaking state is displayed when Alexa is responding to a request with text-to-speech (TTS). This state is not displayed when Alexa is responding with long-running mixable media such as music, books and Flash Briefings.
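The four core states and the transitions described above can be modeled as a small state machine. The following is a sketch under stated assumptions: the event names are hypothetical, and the return to Idle after speech finishes or after a media response is inferred rather than stated in this guideline.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"


# (current state, event) -> next state. Event names are illustrative.
TRANSITIONS = {
    (State.IDLE, "invoked"): State.LISTENING,
    (State.LISTENING, "end_of_speech"): State.THINKING,
    (State.LISTENING, "no_speech_timeout"): State.IDLE,
    (State.THINKING, "tts_response"): State.SPEAKING,
    (State.THINKING, "media_response"): State.IDLE,   # assumption: no Speaking state for media
    (State.THINKING, "invoked"): State.LISTENING,     # barge-in
    (State.SPEAKING, "invoked"): State.LISTENING,     # barge-in
    (State.SPEAKING, "speech_finished"): State.IDLE,  # assumption
}


def next_state(state: State, event: str) -> State:
    """Return the next attention state; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


state = State.IDLE
for event in ("invoked", "end_of_speech", "tts_response", "invoked"):
    state = next_state(state, event)
print(state.value)  # listening
```

The last event in the walk-through is a barge-in during speech, which returns Alexa to the Listening state.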
Voice chrome colors: Blue #214CFB, Cyan #05FEFE, Red #FC361D.
|State|Description|Voice chrome colors|Earcons|
|---|---|---|---|
|Idle|Alexa is available through invocation methods. No visuals are displayed on screen.|None (no visual indicators)||
|Listening Start|Voice chrome appears and earcons play once when the customer wakes Alexa by voice or touch and the microphone becomes active. Different start listening earcons are used for voice-initiated and touch-initiated products.|Blue, Cyan|Touch, Wakeword|
|Listening Active|Voice chrome persists while Alexa is capturing speech from the customer. When end of speech is detected, the end of listening earcon plays and voice chrome transitions to Thinking state.|Blue, Cyan|Listening Active|
|Thinking|Voice chrome plays in a loop while Alexa is processing, or 'thinking about' what the customer has said. Displaying this state ensures that the customer understands that the interaction has not ended.|Blue, Cyan||
|Speaking|Voice chrome plays in a loop while Alexa is responding to the customer via TTS.|Blue, Cyan||
|Microphone Off|Indicates that the customer has turned off hands-free listening. Display voice chrome for 5 seconds after customer turns off Alexa Hands-Free (AHF).|Red||
|Errors|Error state is shown when Alexa cannot be reached. The red voice chrome will show for 5 seconds and then dismiss.|Red|Offline Error prompt in Design Toolkit|
Note: Alexa voice chrome is available as part of the Alexa Auto SDK.
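For teams that need to reason about voice chrome outside the SDK (for example in a test harness), the state table can be captured as data. This is a minimal sketch using the hex values and 5-second durations listed above; the dictionary layout and the `chrome_for` function are hypothetical and do not reflect the Alexa Auto SDK's actual API.

```python
# Hex values as listed in the state table.
COLORS = {"blue": "#214CFB", "cyan": "#05FEFE", "red": "#FC361D"}

# Voice chrome colors per attention state; Idle shows no visuals.
CHROME_COLORS = {
    "idle": [],
    "listening": ["blue", "cyan"],
    "thinking": ["blue", "cyan"],
    "speaking": ["blue", "cyan"],
    "microphone_off": ["red"],
    "error": ["red"],
}

# States whose chrome dismisses automatically, in seconds.
AUTO_DISMISS_S = {"microphone_off": 5, "error": 5}


def chrome_for(state: str) -> dict:
    """Return the hex colors and optional auto-dismiss time for a state."""
    return {
        "colors": [COLORS[name] for name in CHROME_COLORS[state]],
        "dismiss_after_s": AUTO_DISMISS_S.get(state),
    }


print(chrome_for("error"))  # {'colors': ['#FC361D'], 'dismiss_after_s': 5}
```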
(Required) Disable Alexa in restricted modes such as valet or for guest drivers.
Customers expect Alexa to protect their privacy. Disable invocation and access to Alexa when restricted modes are activated in the vehicle’s system (e.g. valet mode).
Example: A driver pulls up to a hotel and quickly turns on “valet mode”. The valet gets into the vehicle and is unable to use Alexa because “valet mode” is enabled. This ensures the valet cannot access the customer’s private information via Alexa.
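The invocation requirements in this guideline (setup completed, Alexa Hands-Free setting, restricted modes) can be combined into a single gate. The following is a minimal sketch; the `VehicleState` fields and the `can_invoke` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class VehicleState:
    setup_complete: bool      # customer has enabled Alexa in setup
    hands_free_enabled: bool  # Alexa Hands-Free (wake word) setting
    restricted_mode: bool     # e.g. valet mode or guest driver


def can_invoke(vehicle: VehicleState, method: str) -> bool:
    """Decide whether an invocation method ('wake_word', 'ptt', 'ttt') may start a dialogue."""
    # No invocation before setup or while a restricted mode is active.
    if not vehicle.setup_complete or vehicle.restricted_mode:
        return False
    # The wake word additionally depends on the hands-free setting.
    if method == "wake_word":
        return vehicle.hands_free_enabled
    # Button invocation stays available when hands-free is off.
    return method in ("ptt", "ttt")


valet = VehicleState(setup_complete=True, hands_free_enabled=True, restricted_mode=True)
print(can_invoke(valet, "ptt"))  # False
```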