Alexa is available on a range of device modes, which means you can design and build a single voice-optimized experience by using Alexa Presentation Language (APL) that easily scales and adapts to every Alexa-enabled device. From smart speakers to light bulbs to widescreen televisions, customers can interact with voice experiences on many different devices. Each device mode has its own form factor, context of use, and set of customer expectations. By following these guidelines, you can bring high-quality visual experiences to any Alexa-enabled device.
Speakers are the category of Alexa-enabled audio devices without a screen, though some might offer a compact or low-resolution screen. Customers primarily use speakers to listen to audio. Speakers can be standalone devices or connected to other devices, such as a soundbar connected to a TV.
All speakers have a microphone and at least one speaker to enable voice interactions. Occasionally, speakers can be connected to other speakers to create a multi-room music setup, or to screened devices like TVs to create theater-like home experiences. Typically, however, speakers are considered to be voice-only experiences because the microphone is the primary input.
Speakers should generally be considered to support voice-only experiences, as most speakers with screens have low-resolution displays that are not sufficient to support the visual experience for an Alexa skill.
Depending on the speaker's location, it might be used by one person or shared between many people. Because the device is typically without a screen to focus attention, customers often multi-task while listening to the device. Speakers are often placed in communal areas of the home or in the office.
Because speakers can be used with audio only, customers can roam farther away from them, depending on the environment.
Settings and authentication
Because speakers often do not have a screen available, or the screen available has limited capabilities, authentication is done in the Alexa app. If your skill needs access to settings or account authentication for the customer to use all of its features, the skill should instruct the customer to go to the Alexa app.
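As a minimal sketch of sending the customer to the Alexa app, the example below builds a raw skill response that pairs a spoken instruction with a link-account card. It assumes the skill uses account linking and returns plain JSON rather than going through an SDK. The function name buildLinkAccountResponse, the SkillResponse type, and the spoken wording are hypothetical and only illustrate the shape.

```typescript
// Hypothetical, trimmed-down type for a raw skill response (illustration only).
interface SkillResponse {
  version: string;
  response: {
    outputSpeech: { type: "PlainText"; text: string };
    card?: { type: "LinkAccount" };
    shouldEndSession: boolean;
  };
}

// Builds a voice-first response that tells the customer to finish
// authentication in the Alexa app. The wording is an assumption.
function buildLinkAccountResponse(featureName: string): SkillResponse {
  return {
    version: "1.0",
    response: {
      outputSpeech: {
        type: "PlainText",
        text: `To use ${featureName}, open the Alexa app and link your account.`,
      },
      // Card shown in the Alexa app that prompts the customer to link an account.
      card: { type: "LinkAccount" },
      shouldEndSession: true,
    },
  };
}

const linkResponse = buildLinkAccountResponse("order history");
console.log(JSON.stringify(linkResponse, null, 2));
```

The spoken instruction carries the whole message, so the response still works on a speaker with no screen at all.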
Hubs are smart displays, such as the Echo Show or Echo Spot, intercoms, and other fixed home devices that are used for music, communications, and entertainment. Hubs have a wide variety of screen sizes and often have touch capabilities.
All hubs have a microphone, camera, and touchscreen. Don't rely only on the touchscreen in your skill. Most customers use a hub at a wide variety of distances, so voice should be considered the primary input method.
Hub devices have a wide range of screen sizes. Screens are often rectangular, in landscape orientation, but can also be 1:1 and circular. Using the out-of-box device groupings for a hub can make it easier to target these typical screen sizes. These include: Hub round small, Hub medium landscape, and Hub large landscape.
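To make the idea of device groupings concrete, here is a rough sketch that sorts a viewport into one of the three hub groupings named above. The Viewport type, the classifyHub function, and both width breakpoints are assumptions chosen for illustration; they are not the official grouping definitions, which should be taken from the APL documentation.

```typescript
// Hypothetical viewport description; field names are illustrative only.
interface Viewport {
  shape: "round" | "rectangle";
  widthDp: number;
  heightDp: number;
}

type HubGroup =
  | "Hub round small"
  | "Hub medium landscape"
  | "Hub large landscape"
  | "unknown";

// Assumed breakpoints, for illustration only.
const ASSUMED_LANDSCAPE_MIN_WIDTH_DP = 960;
const ASSUMED_LARGE_MIN_WIDTH_DP = 1280;

function classifyHub(v: Viewport): HubGroup {
  if (v.shape === "round") return "Hub round small";
  // Portrait viewports are not covered by the three groupings above.
  if (v.widthDp < v.heightDp) return "unknown";
  if (v.widthDp >= ASSUMED_LARGE_MIN_WIDTH_DP) return "Hub large landscape";
  if (v.widthDp >= ASSUMED_LANDSCAPE_MIN_WIDTH_DP) return "Hub medium landscape";
  return "unknown";
}

console.log(classifyHub({ shape: "round", widthDp: 480, heightDp: 480 }));
console.log(classifyHub({ shape: "rectangle", widthDp: 1024, heightDp: 600 }));
```

Designing against a grouping rather than a single device lets one layout cover every screen that falls in the same range.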
Hubs are used in a wide variety of environments and by a wide variety of audiences. Depending on the hub's location, it might be used by one person or shared between many people. When designing for hubs, you can't depend on the customer always focusing on the device. Often, hubs are placed in busy areas of the home or office, so a customer might not be looking at or interacting with the screen when using a skill.
Because of their smaller size, most hubs have a viewing range of 2 to 7 feet. A customer who sits next to the device to touch it is about 2 feet away, while a customer who is working on other tasks or moving around might glance at the device from 7 feet away. Use the farthest reasonable distance in your experience as the default when sizing images and fonts.
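The effect of distance on sizing can be worked out with basic geometry: an element that should cover a fixed visual angle must grow in proportion to the viewing distance. The sketch below assumes a target angle of 0.5 degrees for text height. That value and the function name heightForVisualAngle are illustrative assumptions, not figures from these guidelines.

```typescript
const INCHES_PER_FOOT = 12;

// Physical height (in inches) that spans `angleDegrees` of the viewer's
// field of view at `distanceFeet`: h = 2 * d * tan(angle / 2).
function heightForVisualAngle(distanceFeet: number, angleDegrees: number): number {
  const angleRadians = (angleDegrees * Math.PI) / 180;
  return 2 * distanceFeet * INCHES_PER_FOOT * Math.tan(angleRadians / 2);
}

// Assumed target angle for legible text; pick your own value in practice.
const ASSUMED_TEXT_ANGLE_DEGREES = 0.5;

const nearInches = heightForVisualAngle(2, ASSUMED_TEXT_ANGLE_DEGREES);
const farInches = heightForVisualAngle(7, ASSUMED_TEXT_ANGLE_DEGREES);

// Height scales linearly with distance, so the 7-foot case needs
// 3.5 times the height of the 2-foot case.
console.log(nearInches.toFixed(2), farInches.toFixed(2), (farInches / nearInches).toFixed(1));
```

Sizing for the 7-foot case keeps content legible at a glance, and it only appears larger to a customer sitting close to the device.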
TVs are televisions, set-top boxes, and projectors that are primarily used for entertainment. TVs have a range of screen sizes and aspect ratios, can use touch and a remote as inputs, and can have additional speakers connected to create a home theater experience.
Alexa-enabled TVs have a microphone to enable voice interactions, in addition to a screen, speakers, and remote. However, a device can range from voice-initiated to touch-initiated using press-to-talk on a remote. Because all TVs have a remote, or 5-way controller, make sure you consider showing selected states for controls. Also consider that typing with a 5-way controller can be cumbersome, so voice should be the primary input method where appropriate.
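One way to reason about selected states is to track which control currently has focus as the customer presses the directional keys on a 5-way controller. The sketch below is an illustration only: moveFocus and its grid model are hypothetical, and it assumes controls are laid out in a simple row-major grid.

```typescript
type Direction = "up" | "down" | "left" | "right";

// Returns the index of the control that should show the selected state
// after a directional press. Focus stays put at the edges of the grid.
function moveFocus(
  index: number,
  direction: Direction,
  columns: number,
  itemCount: number
): number {
  const row = Math.floor(index / columns);
  const col = index % columns;
  let nextRow = row;
  let nextCol = col;
  switch (direction) {
    case "up": nextRow = row - 1; break;
    case "down": nextRow = row + 1; break;
    case "left": nextCol = col - 1; break;
    case "right": nextCol = col + 1; break;
  }
  // Ignore moves that would leave the grid.
  if (nextRow < 0 || nextCol < 0 || nextCol >= columns) return index;
  const next = nextRow * columns + nextCol;
  // The last row may be partially filled.
  return next < itemCount ? next : index;
}

// Five controls in a three-column grid: moving down from index 1 lands on index 4.
console.log(moveFocus(1, "down", 3, 5));
```

Because exactly one index is focused at any time, the layout always has one control to draw in its selected state.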
Unlike traditional TVs, smart TVs can also be connected to the internet to access streaming media services, entertainment apps, and web browsers. Newer TVs can also pair with speakers and supporting home devices that manage household functions.
TV devices have a wide range of screen sizes, aspect ratios, and density ranges. Screens are generally in landscape orientation. Make sure you understand the different densities so you deliver appropriately optimized imagery.
UI elements and images should use off-white (#FAFAFA) instead of pure white (#FFFFFF), because pure white feels harsh on large screens. All important UI elements need to be displayed within the TV safe area to avoid overscan issues, where TV manufacturers scale content slightly at the edges of the frame. While overscan is less typical on newer TVs, keep it in mind as you design for all TVs in this category.
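Both rules can be expressed as small helpers. The sketch below assumes a safe-area margin of 5% on each edge, which is an illustrative figure and not a value from these guidelines. The names safeArea and tvSafeColor are likewise hypothetical.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Assumed margin per edge; substitute the value your target TVs require.
const ASSUMED_SAFE_MARGIN = 0.05;

// Returns the rectangle that important UI elements should stay inside.
function safeArea(
  screenWidth: number,
  screenHeight: number,
  margin: number = ASSUMED_SAFE_MARGIN
): Rect {
  const insetX = Math.round(screenWidth * margin);
  const insetY = Math.round(screenHeight * margin);
  return {
    x: insetX,
    y: insetY,
    width: screenWidth - 2 * insetX,
    height: screenHeight - 2 * insetY,
  };
}

const OFF_WHITE = "#FAFAFA";

// Swaps pure white for off-white and leaves every other color untouched.
function tvSafeColor(hex: string): string {
  const normalized = hex.trim().toUpperCase();
  return normalized === "#FFFFFF" || normalized === "#FFF" ? OFF_WHITE : hex;
}

console.log(safeArea(1920, 1080));
console.log(tvSafeColor("#ffffff"));
```

Backgrounds can still extend to the screen edges; only the elements a customer must read or select need to stay inside the returned rectangle.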
TV usage patterns vary. Customers might watch TV alone, with other members of their household, or in groups for big live events. Because of this, TVs are very communal, and it is common for multiple customers to use the device at the same time.
Because the distance to the device is often 10 feet or more, customers often use voice instructions, remotes, and hardware switches to interact with a TV. When designing for this type of device, make visual elements large and clear enough to be visible at a 10-foot distance. Keep your layouts friendly for use with remotes that are not voice-enabled.
Consider that customers might be in a relaxed state, either sitting or lying down while consuming entertainment. Ensure that the attention system leverages cues like media playing to use the correct visual cues and sound files.
Settings and authentication
TVs use a constrained model for settings and authentication, where quick tasks are done on the device and more complex tasks like authentication are done using code-based linking.
Adapt your experience
Customers are increasingly using Alexa on more than one device. Some customers have a hub with a bigger screen in their kitchen and use a small-screen hub in their bedroom. Others have a TV connected to speakers or a soundbar. When thinking about the multi-device experience, consider that customers might move between devices and might use devices that operate together in a connected group.
The easiest way to create a consistent experience is by starting with voice-only experiences. This brings your experience across all device categories, from speakers to TVs. Once you consider the voice-only experience, then you can understand how consistent you want your visual content to be. One of the easiest ways your experience can look visually consistent across devices is through styling. Using the same colors and iconography gives your customer an instant way to visually identify your product and brand across a range of devices.
When adapting content across screens, start with the smallest screen size first. This will help you prioritize what is the most important content you need to show at any given time. It also helps you think through touch targets and how closely you can space elements on the screen. When you start with small screens, you can prioritize which devices you'll design for and what your customer experience should be.
Once you've vetted your experience across the smallest screens, it's good to do the inverse and consider your largest screen sizes. This typically covers larger hubs and TVs. When targeting devices with larger screens, it's more than a simple exercise to scale the content up; large screens should take full advantage of the additional screen real estate, and you will need to pay special attention to image quality so that images do not lose their quality as they scale up.
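To keep images sharp as they scale up, it helps to know how many physical pixels an image slot needs on a given screen. The sketch below assumes density-independent pixels are defined against a 160 dpi baseline, so that pixels = dp × dpi / 160. The helper names requiredPixels and pickAsset, and the list of available asset widths, are hypothetical.

```typescript
// Assumed baseline density for density-independent pixels (dp).
const BASELINE_DPI = 160;

// Physical pixels needed to fill a slot of `dp` width on a screen of `dpi`.
function requiredPixels(dp: number, dpi: number): number {
  return Math.ceil((dp * dpi) / BASELINE_DPI);
}

// Picks the smallest asset wide enough for the slot, falling back to the
// largest available asset when none is wide enough.
function pickAsset(assetWidthsPx: number[], slotWidthDp: number, dpi: number): number {
  if (assetWidthsPx.length === 0) {
    throw new Error("at least one asset width is required");
  }
  const needed = requiredPixels(slotWidthDp, dpi);
  const candidates = assetWidthsPx.filter((w) => w >= needed);
  return candidates.length > 0 ? Math.min(...candidates) : Math.max(...assetWidthsPx);
}

// A 960 dp wide slot on a 320 dpi screen needs 1920 physical pixels.
console.log(requiredPixels(960, 320));
console.log(pickAsset([480, 960, 1920, 3840], 960, 320));
```

Choosing the smallest asset that is still wide enough avoids both upscaling blur and unnecessarily large downloads.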
Now you're set to adapt your content responsively across device categories. Leverage responsive components and templates that have been designed to automatically work for you across all device modes. These are out-of-the-box solutions that work across all Alexa-enabled devices.