Display and Behavior Specifications for Alexa-Enabled Devices with a Screen
For Alexa-enabled devices with a screen, graphical elements complement the voice interaction on the device itself. Thus, a custom skill can include an interactive touch display in its response, in addition to standard voice interaction. For example, a recipe skill can display images of the ingredients and the preparation process at the appropriate points in the skill interaction. A city guide skill can display pictures or videos of requested attractions, or take the user on a guided tour.
Although the display component may enhance the user experience considerably, voice continues to be the primary interaction method with Alexa.
Cards provide additional information for the user beyond the Alexa voice response. They are supported in the Alexa app, but the user must view them separately in the app on their phone or computer. Thus, cards, while useful, are not meant to be part of the main workflow of the skill. With Alexa-enabled devices with a screen, the screen displays may be a significant aspect of the skill's main workflow.
Designing a skill for Alexa-enabled devices with a screen allows integration of voice, touch, text, images, and video.
For general guidance on creating a skill, see Steps to Build a Custom Skill.
To create a skill with screen support, see Display Interface Reference.
To include video in your Alexa skill, see VideoApp Interface Reference.
For best practices, see Best Practices for Skills With Screen Display.
You can use the Node.js SDK and the Java SDK to facilitate the creation of Alexa skills.
Display and Interaction Features for Screen Devices
Skill developers can take advantage of the following features for display and navigation.
Display Specifications for Alexa-Enabled Devices with a Screen
|Device|Width (px)|Height (px)|DPI*|Shape|Input Type|Aspect Ratio|
|Fire TV Cube|1920|1080|320|Rect|dpad|16:9|
|Fire HD 8|1280|800|320|Rect|Touch|16:10|
|Fire HD 10|1920|1200|320|Rect|Touch|16:10|
* pixels per inch
Display Differences Among Alexa-Enabled Devices with a Screen
See Display Template Elements for a template-by-template comparison between the screen displays for Echo Show and Echo Spot.
You are expected to use a single design for all screen displays. Keep these specific display behaviors in mind as you design your screen displays. The Alexa Simulator is useful for testing Echo Show or Echo Spot displays if you cannot test directly with these devices.
Because of its smaller size and screen shape, Echo Spot has the following unique characteristics, compared to other Alexa-enabled devices with a screen:
- Hints are not displayed in any template on Echo Spot. If a hint is part of the template, it will be hidden automatically on Echo Spot.
- List item images within horizontal lists (ListTemplate2) are rendered in the background with the list content layered over top.
- Background images will scale down to 480 pixels on the shortest side (most likely height) while maintaining the aspect ratio. The image will then center within the available viewport. In some cases, this means the edges of the background image will be cropped.
- Header text is limited to two lines of text and will truncate with ellipses after that.
- Not all templates will display header text on Echo Spot, such as BodyTemplate2 or ListTemplate2. See the template descriptions for details.
- Echo Spot wraps text more than other screen devices because of its smaller screen. Text wrapping may produce undesirable line lengths, but this is by design. Changing the default font sizes affects all device types, so leave the font sizes unchanged.
- Echo Spot does not have a back button.
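The background-image rule in the list above can be worked through numerically. The following is a minimal sketch of the arithmetic, not the device's actual rendering code. The function name, the 480 x 480 viewport, and the choice not to enlarge smaller images are assumptions made for illustration; the text above states only the 480-pixel shortest side, the preserved aspect ratio, and the centering.

```python
def fit_background(width, height, short_side=480, viewport=(480, 480)):
    """Scale an image so its shortest side is `short_side`, then center it."""
    # Assumption: images already smaller than the target are not enlarged.
    scale = min(1.0, short_side / min(width, height))
    scaled_w = round(width * scale)
    scaled_h = round(height * scale)
    # Centering hides whatever extends past the (assumed) viewport,
    # split evenly between the two opposite edges.
    crop_x = max(0, scaled_w - viewport[0]) / 2
    crop_y = max(0, scaled_h - viewport[1]) / 2
    return {"scaled": (scaled_w, scaled_h), "cropped_per_side": (crop_x, crop_y)}


print(fit_background(1920, 1080))
```

Under these assumptions, a 1920 x 1080 background scales to about 853 x 480, and roughly 186 pixels are cropped from the left and right edges, so important content belongs near the center of the image.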
For a Fire HD 8 or Fire HD 10, the customer can use the device to interact with Alexa in two different ways. By default, the screen content will be in the form of cards, as in the Alexa app, and the interaction will otherwise be similar to a device without a screen. The customer can also switch the device to Show Mode, which enables on-screen content and behavior similar to Echo Show.
Screen displays for Alexa devices support the following graphical features.
- Body templates: display content, which can include text and images, in predefined formats
- List templates: display content as horizontal or vertical lists in predefined formats
- Images: used in list templates or body templates
- AudioPlayer: appears when the skill plays audio
- Video: appears when the skill plays video
- Touch selection events: list items and action links may be activated by touch if the skill is coded to support that
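As an illustration of a list template, here is a minimal sketch that builds a Display.RenderTemplate directive as a plain Python dictionary, without any SDK. The helper names, token values, and example.com URL are hypothetical, and the field layout is a best-effort reading of the Display interface that should be checked against Display Interface Reference before use.

```python
import json


def list_item(token, text, image_url=None):
    """Build one list entry; `token` identifies the item if it is selected."""
    item = {
        "token": token,
        "textContent": {"primaryText": {"type": "PlainText", "text": text}},
    }
    if image_url:
        item["image"] = {"contentDescription": text, "sources": [{"url": image_url}]}
    return item


def render_list_directive(title, items, horizontal=False):
    """Wrap list items in a Display.RenderTemplate directive."""
    return {
        "type": "Display.RenderTemplate",
        "template": {
            # ListTemplate2 is the horizontal list; ListTemplate1 is assumed vertical.
            "type": "ListTemplate2" if horizontal else "ListTemplate1",
            "token": "recipe-list",  # hypothetical template token
            "title": title,
            "listItems": items,
        },
    }


items = [
    list_item("recipe-pancakes", "Pancakes", "https://example.com/pancakes.png"),
    list_item("recipe-omelette", "Omelette"),
]
directive = render_list_directive("Breakfast recipes", items, horizontal=True)
print(json.dumps(directive, indent=2))
```

The resulting dictionary would go into the directives array of the skill response. Giving each item its own token is what lets the skill tell touch selections apart later.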
Alexa skills continue to support cards, whether the device with Alexa has a screen or not. If a skill is programmed so that a card appears in the Alexa app, and the skill has not been programmed to use a screen template, then that card will appear on the screen of the Alexa-enabled device with a screen. If the skill has no card or template display, then the skill name and the skill icon appear on the screen, with the hint text "When you’re ready to quit, try ‘Alexa, exit’."
Types of Interactions With Alexa Custom Skills
Custom skills designed for Alexa must take the following interactions into account.
Voice interactions. Voice remains the primary means of interacting with Alexa, even if you have a display. If your skill requires a screen to be used effectively, such as a photo-browsing skill, create a workflow that informs customers who are not using a supported screen device of the need for a screen. A user can control screen display by speaking the following actions.
Alexa app interactions. A custom skill may cause a card with more information to be displayed in the Alexa app. If the custom skill is used with a supported screen device, this card also appears on the screen if the response does not also include a display template. Thus, a skill that includes no display templates will show all of its cards on Alexa-enabled devices with screens. If a response includes both a card and a display template, the display template appears on the screen. This display template remains on the screen until the next response that includes a card or display template is sent.
Screen display interactions. If the custom skill uses display templates, and the correct interaction triggers a display template, then the corresponding text and images are displayed on the screen.
Screen touch interactions. When a customer touches an item on screen that has been encoded with a select intent, that will trigger a specific action that has been programmed in the skill, such as displaying a recipe that corresponds to the selected item.
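A touch selection could be handled along the following lines. This is a sketch under stated assumptions: it takes the touch to arrive as a request of type Display.ElementSelected that carries the token of the touched item, and the RECIPES table, helper names, and token values are hypothetical.

```python
# Hypothetical lookup table: list-item token -> recipe text.
RECIPES = {
    "recipe-pancakes": "Pancakes: mix flour, milk, and eggs, then fry.",
    "recipe-omelette": "Omelette: beat the eggs, then cook them in butter.",
}


def speech_response(text, end_session=False):
    """Wrap plain text in a minimal skill response."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def handle_element_selected(envelope):
    """Return a response for a touch selection, or None for other requests."""
    request = envelope["request"]
    if request.get("type") != "Display.ElementSelected":
        return None  # not a touch selection; leave it to other handlers
    recipe = RECIPES.get(request.get("token"))
    if recipe is None:
        return speech_response("Sorry, I could not find that item.")
    return speech_response(recipe)


touch = {"request": {"type": "Display.ElementSelected", "token": "recipe-pancakes"}}
print(handle_element_selected(touch)["response"]["outputSpeech"]["text"])
```

A real skill would typically answer with a new display template for the selected recipe as well as speech; the sketch returns speech only to keep the routing logic visible.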
Audio, Display, and Video Capability for Devices With Alexa
Devices with Alexa support the AudioPlayer interface to play audio files.
Echo Show and Echo Spot support the Display interface to display content on the device screen, and the VideoApp interface to play video files on the screen.
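A skill can find out at run time which of these interfaces the requesting device supports. The sketch below assumes the request envelope lists them under context.System.device.supportedInterfaces; the function name and the two sample envelopes are made up for illustration.

```python
def supported_interfaces(envelope):
    """Return the names of the interfaces the requesting device reports."""
    device = envelope.get("context", {}).get("System", {}).get("device", {})
    return set(device.get("supportedInterfaces", {}))


# Hand-written sample envelopes, trimmed to the relevant fields.
screen_request = {
    "context": {"System": {"device": {"supportedInterfaces": {
        "AudioPlayer": {}, "Display": {}, "VideoApp": {},
    }}}}
}
voice_only_request = {
    "context": {"System": {"device": {"supportedInterfaces": {"AudioPlayer": {}}}}}
}

print(sorted(supported_interfaces(screen_request)))
print("Display" in supported_interfaces(voice_only_request))
```

Checking the request rather than hard-coding device names keeps the skill working when it is enabled on a device it was not designed for.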
How and When to Use Display Templates to Render Screen Displays
When you develop a custom skill, you determine the form of the response that your skill will send to Alexa. This response may be voice-only, or it may also include a card or screen display. The display template you want to use, if any, is included in the JSON-formatted response, just as speech output and cards can be included in the response. If both a display template and a card are included in the response, the display template is rendered to the screen. If only a card is included, then the card is rendered to the screen. If neither a card nor a template is specified, then a body template showing the skill name and the skill icon is rendered to the screen. Display templates are not rendered in the Alexa companion app.
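The precedence rule described above can be modeled in a few lines. This is only an illustrative model of what a screen device shows for a given response, not code that runs on the device; the function name and the sample response are assumptions.

```python
def screen_content(response):
    """Model which element of a response a screen device would render."""
    directives = response.get("directives", [])
    if any(d.get("type") == "Display.RenderTemplate" for d in directives):
        return "template"  # a display template wins over a card
    if "card" in response:
        return "card"  # card only: the card is shown on the screen
    return "default"  # neither: skill name and skill icon


both = {
    "outputSpeech": {"type": "PlainText", "text": "Here is your recipe."},
    "card": {"type": "Simple", "title": "Pancakes", "content": "Flour, milk, eggs."},
    "directives": [{
        "type": "Display.RenderTemplate",
        "template": {
            "type": "BodyTemplate1",
            "token": "recipe-body",  # hypothetical token
            "title": "Pancakes",
            "textContent": {
                "primaryText": {"type": "PlainText", "text": "Flour, milk, eggs."}
            },
        },
    }],
}

print(screen_content(both))
print(screen_content({"card": both["card"]}))
print(screen_content({"outputSpeech": both["outputSpeech"]}))
```

Including both a card and a template, as in the first sample, serves screen devices with the template while the card still reaches the Alexa app.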
In general, you, as the skill developer, want a visually uncluttered experience for the skill user, with display templates effectively displayed when they enhance the user experience. Typically, you should only return display templates when responding with information that the user has requested. Other responses, such as questions to ask the user for more information, do not typically include display templates.
Create a multi-modal interaction model for your Alexa skill
The VUI (Voice User Interface), combined with the GUI (Graphical User Interface) and touch elements of Echo Show and Echo Spot, provides a unique user experience. Similarly, the VUI, combined with the GUI and the lean-back living room environment, provides a unique user experience for Fire TV Cube. When you design your skill, consider how all of these elements will work together for the users of your skill.
The flow of the skill becomes significant when you consider what commands such as "Up" should do. For example, in a recipe skill, would the user go back to a previous step in a recipe, or to a previous recipe? Similarly, if the user says "Up" in a fact skill, what should the behavior be, if any? As the skill developer, you must specify this behavior in the service code for the skill. Plan how you want the user to interact with your skill.
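One way to make such a decision explicit is to keep the user's position in session state and encode the chosen policy in a single handler. The sketch below assumes a recipe skill in which "Up" means "previous step" and never leaves the current recipe. The function name, the session keys, and the wording of the prompts are all hypothetical, and how the spoken command reaches this handler is left to the skill's interaction model.

```python
def handle_up(session):
    """Move back one step within the current recipe, if possible."""
    step = session.get("step", 0)  # zero-based index of the current step
    if step > 0:
        session["step"] = step - 1
        speech = f"Going back to step {session['step'] + 1}."
    else:
        # Chosen policy: stay inside the recipe instead of jumping
        # to the previous recipe.
        speech = "You are already at the first step of this recipe."
    return session, speech


state = {"recipe": "pancakes", "step": 2}
state, speech = handle_up(state)
print(speech)
```

A skill that prefers the other reading of "Up" (previous recipe) would change only this handler, which is the benefit of deciding the behavior in one place.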
Consider how your skill works across devices
If you design your skill for a particular device, such as Echo Spot, remember that the user may also enable the skill on other Alexa devices, with or without a screen or touch features. Decide whether to offer these features in a way that does not depend on the screen and touch, or to inform the user that the skill, or certain features of it, is supported only on particular devices. For example, the VideoApp interface, which allows videos in a skill, is not supported on devices without a screen, such as Amazon Echo.
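The video case can be sketched as follows: launch the video when the device reports the VideoApp interface, and otherwise tell the user that a screen is needed. The function name, the example.com URL, and the spoken texts are hypothetical, and the directive layout is a best-effort reading that should be verified against VideoApp Interface Reference, including its rules for session handling.

```python
def video_response(envelope, url, title):
    """Launch a video on capable devices; fall back to speech elsewhere."""
    interfaces = (
        envelope.get("context", {})
        .get("System", {})
        .get("device", {})
        .get("supportedInterfaces", {})
    )
    if "VideoApp" in interfaces:
        return {
            "version": "1.0",
            "response": {
                "outputSpeech": {"type": "PlainText", "text": f"Playing {title}."},
                "directives": [{
                    "type": "VideoApp.Launch",
                    "videoItem": {"source": url, "metadata": {"title": title}},
                }],
            },
        }
    # No video support: explain the limitation instead of failing silently.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "PlainText",
                "text": "This video needs a device with a screen.",
            },
            "shouldEndSession": True,
        },
    }


echo = {"context": {"System": {"device": {"supportedInterfaces": {"AudioPlayer": {}}}}}}
print(video_response(echo, "https://example.com/tour.mp4", "City Tour"))
```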
Upgrade an existing skill for Alexa-enabled devices with a screen
You may have already created a certified skill that works with Alexa-enabled devices without screens. You can modify this skill to take advantage of the display templates and touch features.
To upgrade your skill for Alexa-enabled devices with a screen, follow these steps.
Rethink the workflow of the skill. When a customer interacts with your skill using a screen, the experience will be different. Determine how your skill will work both with and without screens.
Add support for the Display.RenderTemplate interface to the skill, as described in Display Interface Reference.
Modify the code of the skill service to reflect the new workflow you have designed, as well as preserve the experience for users who do not have an Alexa-enabled device with a screen.
Test the skill with its new features.
Publish the changes.
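The code change in these steps often comes down to one pattern: keep the existing speech and card, and add a display template only when the device can render it. The following is a minimal sketch of that pattern using plain dictionaries. The function names, the token, and the sample envelopes are hypothetical, and the template fields should be checked against Display Interface Reference.

```python
def has_display(envelope):
    """True if the requesting device reports the Display interface."""
    device = envelope.get("context", {}).get("System", {}).get("device", {})
    return "Display" in device.get("supportedInterfaces", {})


def build_response(envelope, speech, title, body_text):
    """Existing voice and card output, plus a template on screen devices."""
    response = {
        "outputSpeech": {"type": "PlainText", "text": speech},
        "card": {"type": "Simple", "title": title, "content": body_text},
        "shouldEndSession": True,
    }
    if has_display(envelope):
        response["directives"] = [{
            "type": "Display.RenderTemplate",
            "template": {
                "type": "BodyTemplate1",
                "token": "fact-body",  # hypothetical token
                "backButton": "HIDDEN",
                "title": title,
                "textContent": {
                    "primaryText": {"type": "PlainText", "text": body_text}
                },
            },
        }]
    return {"version": "1.0", "response": response}


with_screen = {"context": {"System": {"device": {"supportedInterfaces": {"Display": {}}}}}}
without_screen = {"context": {"System": {"device": {"supportedInterfaces": {}}}}}
print("directives" in build_response(with_screen, "Hi", "Fact", "Text")["response"])
print("directives" in build_response(without_screen, "Hi", "Fact", "Text")["response"])
```

Because the voice and card output is unchanged, users on devices without a screen get the same experience as before the upgrade.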