Use Alexa Presentation Language

Key takeaways

Alexa has a visual design framework called Alexa Presentation Language (APL), which allows you to build interactive voice and visual experiences across the device landscape. This multimodal experience can make skills more delightful and engaging to the customer. You can design custom visual elements for standard Alexa-enabled devices such as the Echo Show, Fire TV, and select Fire Tablet devices. Learn more about APL in the Alexa Learning Lab.

 

Need quick advice?

View the Design Checklist for Alexa Presentation Language (APL) for tips on how to create a great skill experience with APL.

 


About APL components

Amazon created APL so you can design custom experiences that combine voice, audio, and visual elements in a single customer interface. The framework is adaptable, so one design can scale to multiple device types while keeping the visual and voice elements synchronized. There are many ways you can use APL to enrich the customer experience. With APL, you can provide customers with complementary information at a glance from across the room or offer visual cues, such as lists or search items. APL also supports voice commands, so customers can ask for an item on screen instead of relying on touch alone. This gives your skill fluidity between interaction types, making customer interactions seamless and intuitive.

For more information about Alexa Presentation Language, see Add Visuals and Audio to Your Skill.


Images

You can show images on screen with or without text, and you can make them responsive to touch by wrapping them in a TouchWrapper. You can also apply filters, such as blur, to images.


When placing components on top of images, use the overlay (scrim) to apply a colored opacity layer over your image to help with the legibility and accessibility of your content. When you want to de-emphasize an image, you can also change its opacity to create different effects. You can use images as …

  • Backgrounds: Background images create an enriching visual experience without interfering with the primary content. Place images into the background of your layout to provide texture to the primary content shown on screen.
  • Thumbnails: Use thumbnails to differentiate between search results or pair an image with a text component to provide additional context for an option.
  • Icons: Use smaller images to provide tertiary content, such as star ratings.
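As a rough illustration of the background case, the sketch below builds an APL Image component as a Python dictionary, with a scrim and a blur filter applied. The helper name, the example URL, the scrim color, and the blur radius are assumptions chosen for illustration; check the property names against the APL Image reference before relying on them.

```python
import json


def background_image(source_url, scrim_color="#00000080", blur_radius="8dp"):
    """Build an APL Image component meant to sit behind primary content.

    The scrim color (#RRGGBBAA, here 50% black) and blur radius are
    illustrative defaults, not values recommended by the guide.
    """
    return {
        "type": "Image",
        "source": source_url,
        "width": "100vw",
        "height": "100vh",
        # Cover the whole viewport instead of leaving empty bars.
        "scale": "best-fill",
        # Scrim: a colored, semi-transparent layer drawn over the image.
        "overlayColor": scrim_color,
        # Blur de-emphasizes the image so foreground text stays legible.
        "filters": [{"type": "Blur", "radius": blur_radius}],
    }


# Hypothetical image URL, used only for this example.
image = background_image("https://example.com/background.jpg")
print(json.dumps(image, indent=2))
```

Keeping the scrim and blur as parameters makes it easy to tune legibility for each background without touching the layout.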


Touch Targets

Ensure that every touch target can be selected by voice as well as by touch. If you wrap a text string in a TouchWrapper, the string should ideally match the phrase that triggers the intent. Because touch wrappers are intended to be touched, we recommend a minimum size of 48x48dp, which creates a physical touch target of 9mm, regardless of screen size. You can use the TouchWrapper for …

  • Sequences and lists: Use the TouchWrapper to wrap items in your sequence so that customers can select each one using touch to view more detailed content.
  • Images: In combination with the image component, the TouchWrapper can be used to create navigational items, such as graphical buttons, or to add points of interaction and selection on images such as a game board.
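The touch-target guidance above might be sketched as follows. This is a minimal example that assumes a hypothetical helper, touch_item, and a hypothetical "ItemSelected" event argument; the TouchWrapper, onPress, and SendEvent names come from APL, while everything else is illustrative.

```python
MIN_TOUCH_DP = 48  # minimum touch target edge recommended above


def touch_item(label, width_dp=MIN_TOUCH_DP, height_dp=MIN_TOUCH_DP):
    """Wrap a text label in a TouchWrapper that reports a press to the skill.

    The visible label doubles as the phrase a customer would say, so the
    same item can be selected by voice or by touch.
    """
    if width_dp < MIN_TOUCH_DP or height_dp < MIN_TOUCH_DP:
        raise ValueError(
            f"touch targets should be at least {MIN_TOUCH_DP}x{MIN_TOUCH_DP}dp"
        )
    return {
        "type": "TouchWrapper",
        "width": f"{width_dp}dp",
        "height": f"{height_dp}dp",
        # "ItemSelected" is a made-up argument the skill backend would match on.
        "onPress": {"type": "SendEvent", "arguments": ["ItemSelected", label]},
        "item": {"type": "Text", "text": label},
    }


button = touch_item("Show details", width_dp=160, height_dp=48)
```

Rejecting undersized targets at build time is one way to keep the 48dp minimum from eroding as layouts change.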


Text and ScrollViews

When you show text, you can specify the text color, size, and weight for available fonts. You can use TouchWrapper and ScrollView to make your text touch-responsive and to display text that extends beyond the bounds of its container. This enables customers to touch to scroll below the “fold” (that is, the default viewable area of the screen before the customer scrolls). Note that APL does not support custom fonts.


When you want to add emphasis to text or remove it, you can change the text's color and opacity to help distinguish states, or primary and secondary content. Too much text on screen can distract from the voice experience and overwhelm the customer. Remember to …

  • When using the text component in a List, try to limit your text to 3 lines for each list item.
  • Limit the number of rich text formatting options to two styles (for example, bold and italic).
  • Use as few text sizes as possible so that the hierarchy and meaning of your message stay clear.
  • Make sure to have strong contrast between the text color and background color to make it easier for customers to read your text, especially at a distance.
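A minimal sketch of these text guidelines, written as Python dictionaries that mirror APL Text and ScrollView components. The helper names, colors, font size, and opacity values are assumptions for illustration only.

```python
def list_item_text(text, primary=True):
    """Build a Text component for a list item, capped at three lines."""
    return {
        "type": "Text",
        "text": text,
        "maxLines": 3,  # keep each list item to three lines
        "fontSize": "24dp",  # illustrative size; reuse it to limit text sizes
        "fontWeight": "bold" if primary else "normal",
        "color": "#FAFAFA",  # light text, intended for a dark background
        # Lower opacity de-emphasizes secondary content.
        "opacity": 1.0 if primary else 0.6,
    }


def scrollable_text(text):
    """Wrap long text in a ScrollView so customers can scroll below the fold."""
    return {
        "type": "ScrollView",
        "width": "100%",
        "height": "100%",
        "item": {"type": "Text", "text": text},
    }


title = list_item_text("Pasta Palace", primary=True)
detail = list_item_text("Italian, 0.4 miles away", primary=False)
article = scrollable_text("A longer description that may not fit on one screen.")
```

Deriving primary and secondary styles from one helper keeps the number of distinct sizes and weights small.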


Slideshows (pagers & sequences)

You can use Pager to show a time-ordered sequence of items that typically advance automatically, such as slideshows. Alternatively, you can use Sequence to show a continuous list of choices, such as local restaurants, and allow customers to navigate the list by voice, touch, or remote control. (For most devices, touching the screen will pause pagination.)


Pager is best used for images or text that don't match exactly with the TTS that Alexa is reading, or for content that you don't want the customer to scroll through. For example, you can use Pager to automatically paginate through a carousel of images, or a series of cards displaying sports scores.

Critical information should be spoken. Customers may not be looking at the displayed content, or may miss items at the end of your presentation. Content presented using the Pager component works best when not combined with too many other layouts displayed on screen. Too many things happening at once can be distracting to customers.
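The sports-score example could be sketched roughly as below: a Pager holding one card per score, plus an AutoPage command that advances it. The helper names, the pager id, and the three-second delay are assumptions; how the command is delivered (for example, from the document's onMount handler or through an ExecuteCommands directive) is left to the skill.

```python
def score_pager(scores, pager_id="scoresPager"):
    """Build a Pager with one centered Text card per score line."""
    return {
        "type": "Pager",
        "id": pager_id,
        "width": "100%",
        "height": "100%",
        "items": [
            {"type": "Text", "text": score, "textAlign": "center"}
            for score in scores
        ],
    }


def auto_page_command(pager_id, delay_ms=3000):
    """Build an AutoPage command that advances the pager automatically.

    delay_ms is how long each page stays on screen; 3000 is illustrative.
    """
    return {"type": "AutoPage", "componentId": pager_id, "duration": delay_ms}


pager = score_pager(["Lions 3, Bears 1", "Hawks 2, Owls 2"])
command = auto_page_command(pager["id"])
```

Because customers may miss a card, the skill's spoken response should still carry any critical score rather than relying on the pager alone.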

With the Sequence component, you can place a list within your skill. Sequences are best suited for providing multiple options or results for a customer to choose from in a predetermined order. Only use one sequence per screen so that the customer understands how to control the sequence with voice commands.

  • Numbering each item in your sequence with an ordinal is important for enabling easy selection for the customer. Always be sure to start with 1 and increment by 1 throughout your sequence.
  • You can set a scroll direction of vertical or horizontal for your sequence. Sequences with text work best in a vertical scrolling orientation, while sequences with images work best in a horizontal scrolling orientation.
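The numbering and scroll-direction guidance above might look like this in practice. The sketch numbers items in Python, starting at 1, rather than relying on any APL data-binding feature; the helper name and sample data are hypothetical.

```python
def numbered_sequence(options, direction="vertical"):
    """Build a Sequence whose items carry ordinals starting at 1."""
    if direction not in ("vertical", "horizontal"):
        raise ValueError("direction must be 'vertical' or 'horizontal'")
    return {
        "type": "Sequence",
        "scrollDirection": direction,
        "width": "100%",
        "height": "100%",
        "items": [
            # enumerate(start=1) gives 1, 2, 3, ... with no gaps.
            {"type": "Text", "text": f"{ordinal}. {option}", "maxLines": 3}
            for ordinal, option in enumerate(options, start=1)
        ],
    }


restaurants = numbered_sequence(["Pasta Palace", "Sushi Corner", "Taco Stop"])
labels = [item["text"] for item in restaurants["items"]]
```

With visible ordinals, a customer can say something like "select number two" and the skill can map the spoken number directly to the list position.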


Video

You can include video content within your APL layouts to continue your skill experience when the skill completes media playback. You can customize the video playback as well as build in playback controls like play, pause, and rewind buttons. Always include closed captioning in your videos, and remember to include a screenshot to use as a static preview.


Provide a way to pause the video content by voice and by using an on-screen button or other control. Customers should always control the video playback experience unless there is a specific reason for the experience to control it. Whenever possible, allow the customer to choose to repeat or loop a video. Finally, allow the customer to use familiar terms to control playback using voice. At a minimum, provide a play, pause, and full screen button.
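One possible way to pair a Video component with on-screen controls is sketched below. The helper name, the video id, and the layout proportions are assumptions, and the example covers only play, pause, and rewind through the ControlMedia command; a full-screen control and the matching voice intents are not shown.

```python
def video_with_controls(source_url, video_id="skillVideo"):
    """Build a Container holding a Video and a row of playback buttons."""

    def control_button(label, command):
        # Each button sends a ControlMedia command to the video component.
        return {
            "type": "TouchWrapper",
            "onPress": {
                "type": "ControlMedia",
                "componentId": video_id,
                "command": command,
            },
            "item": {"type": "Text", "text": label},
        }

    return {
        "type": "Container",
        "width": "100%",
        "height": "100%",
        "items": [
            {
                "type": "Video",
                "id": video_id,
                "source": source_url,
                # Let the customer start playback instead of autoplaying.
                "autoplay": False,
                "width": "100%",
                "height": "80%",
            },
            {
                "type": "Container",
                "direction": "row",
                "items": [
                    control_button("Play", "play"),
                    control_button("Pause", "pause"),
                    control_button("Rewind", "rewind"),
                ],
            },
        ],
    }


# Hypothetical video URL, used only for this example.
layout = video_with_controls("https://example.com/clip.mp4")
commands = [b["onPress"]["command"] for b in layout["items"][1]["items"]]
```

The button labels use the same words a customer is likely to say, which keeps the touch and voice controls aligned.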


Design Checklist for Alexa Presentation Language

Use APL components & imagery strategically

Pair Alexa Presentation Language (APL) components with the appropriate voice interaction. For example, use lists for enhanced search and browsing via voice. Consider which APL features would best serve your customers at a given step in their experience. The images (photography, icons, and other vector graphics) your skill uses on screen should be appropriate to the customer’s context.

 

▢  APL components should enhance the voice experience – they shouldn’t detract from or complicate the customer experience for their own sake

▢  Visuals should be contextually relevant to the conversation; don't surface imagery that conflicts with what the skill might be telling the customer

▢  Use images that would enhance the understanding and experience of the content

▢  Avoid using generic imagery that doesn’t add value to the customer’s experience

▢  Avoid embedding text in imagery when possible (excluding logos)

▢  Use high-quality images that look crisp on a range of device sizes

▢  Scale images in a way that won't cause letterboxing on some devices
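To make the letterboxing item concrete, here is a small arithmetic sketch. It compares the uniform scale factor that fits an image entirely inside a viewport with the factor that covers the viewport; when the two differ, fitting leaves empty bars and covering crops the image. The function names and the sample dimensions are illustrative assumptions.

```python
import math


def scale_factors(image_w, image_h, viewport_w, viewport_h):
    """Return (fit, fill) uniform scale factors for an image in a viewport.

    fit  : largest scale at which the whole image is visible (may letterbox)
    fill : smallest scale at which the viewport is fully covered (may crop)
    """
    ratio_w = viewport_w / image_w
    ratio_h = viewport_h / image_h
    return min(ratio_w, ratio_h), max(ratio_w, ratio_h)


def letterboxes(image_w, image_h, viewport_w, viewport_h):
    """True when fitting the whole image would leave empty bars."""
    fit, fill = scale_factors(image_w, image_h, viewport_w, viewport_h)
    return not math.isclose(fit, fill)


# A 16:9 image on a 16:9 viewport fits exactly; on a 16:10 viewport it does not.
same_ratio = letterboxes(1920, 1080, 1280, 720)
different_ratio = letterboxes(1920, 1080, 1280, 800)
```

Running a check like this across the viewport sizes you target shows which images need a cover-style scale, a different crop, or a separate asset.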


For more information about designing a great multimodal experience for your skill, see Multimodal design: Introduction.
