Display Cards Overview

Display cards for Alexa allow products to render visual content, including calendars, weather, and shopping lists. If your device has a display, you can enable the Display Cards capability in the Alexa developer console to begin receiving directives with visual metadata. For example, after you've enabled Display Cards for your product, when a user asks, "Alexa, what is the weather?", in addition to receiving a Speak directive with Alexa text-to-speech (TTS), the device receives a RenderTemplate directive with visual metadata that maps to design templates provided by Amazon.

Enable display cards

To enable display cards, declare the TemplateRuntime version 1.0 interface in your call to the Capabilities API. For more details, see Capabilities API.
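
As a rough illustration, the declaration could be assembled as in the Python sketch below. Only the TemplateRuntime interface name and version 1.0 come from this page. The envelope fields (`envelopeVersion`, `capabilities`, `type`) and their values are assumptions about the Capabilities API request body, and `build_capabilities_body` is a hypothetical helper, so check both against the Capabilities API reference before relying on them.

```python
import json


def build_capabilities_body(other_interfaces=()):
    """Build a request body that declares TemplateRuntime 1.0.

    The envelope layout is an assumption, not taken from this page.
    other_interfaces is an iterable of (interface_name, version) pairs.
    """
    capabilities = [
        {"type": "AlexaInterface", "interface": "TemplateRuntime", "version": "1.0"},
    ]
    for name, version in other_interfaces:
        capabilities.append(
            {"type": "AlexaInterface", "interface": name, "version": version}
        )
    return {"envelopeVersion": "20160207", "capabilities": capabilities}


# Hypothetical usage: declare TemplateRuntime next to one other interface.
body = build_capabilities_body([("SpeechSynthesizer", "1.0")])
print(json.dumps(body, indent=2))
```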

Flow and delivery

This diagram illustrates the high-level message flow for delivering visual metadata to an AVS-enabled product.

Data flow diagram.
  1. A user asks, "Who is Usain Bolt?" Your product captures the speech and streams it to AVS.
  2. AVS returns two directives:
    • A Speak directive that instructs your client to play Alexa TTS.
    • A RenderTemplate directive that instructs your client to display visual metadata – in this case, information about Usain Bolt.
  3. Playback of Alexa TTS starts.
  4. Your client renders the RenderTemplate directive immediately on a separate thread and, if possible, in tandem with the Speak directive.
  5. Your client informs AVS that your product has started playing Alexa TTS by sending a SpeechStarted event.
  6. When playback of Alexa TTS finishes, your client sends a SpeechFinished event to AVS.

A tablet-sized screen would display something similar to the following example image:

BodyTemplate2

TemplateRuntime directives

The TemplateRuntime interface exposes two directives for the delivery of visual metadata:

  • The RenderPlayerInfo directive instructs your client to display visual metadata associated with user requests for audio playback, such as music. In addition to sending a Play directive, AVS sends a RenderPlayerInfo directive with visual metadata specific to an audio content provider, which the client binds to a template and renders for the user.
  • The RenderTemplate directive instructs your client to display visual metadata associated with user requests for static display cards. For example, when a user asks, "Alexa, what's the weather in San Francisco?", AVS sends a Speak directive and also a RenderTemplate directive with visual metadata that the client binds to a template and renders for the user.
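
One way a client might route these two directives is a lookup keyed on the header's namespace and name. The following Python sketch assumes the message shape used in the sample payload on this page. The handler functions and their return strings are hypothetical placeholders for real rendering code.

```python
def handle_render_template(payload):
    # Placeholder: a real client would bind the payload to a static card template.
    return f"static card: {payload.get('type')}"


def handle_render_player_info(payload):
    # Placeholder: a real client would bind the payload to the "Now Playing" template.
    return "now playing card"


HANDLERS = {
    ("TemplateRuntime", "RenderTemplate"): handle_render_template,
    ("TemplateRuntime", "RenderPlayerInfo"): handle_render_player_info,
}


def dispatch(message):
    """Route a directive to its handler; return None for unhandled directives."""
    directive = message["directive"]
    header = directive["header"]
    handler = HANDLERS.get((header["namespace"], header["name"]))
    if handler is None:
        return None
    return handler(directive.get("payload", {}))
```

Directives from other interfaces, such as Speak or Play, fall through and return None here, so a client can hand them to whatever component already handles them.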

Display card templates

The TemplateRuntime interface supports five templates: one for "Now Playing" and four for static display cards. The visual metadata provided in each directive maps to a specific component in each display card template.

"Now Playing" visual metadata is always delivered as a RenderPlayerInfo directive and maps to the "Now Playing" template for music.

The RenderTemplate directive delivers static display cards and always includes the type parameter. This instructs your client to map the provided visual metadata to a specific template type:

Type | Description | Use Cases
BodyTemplate1 | A text-based template that supports title, subtitle, text, and skill icons. | Wikipedia entries without images, and cards provided by Alexa Skills.
BodyTemplate2 | A template with support for body text and a single image. | Wikipedia entries with images.
ListTemplate1 | A template for lists and calendar entries. | Shopping lists, to-do lists, and calendar entries.
WeatherTemplate | A template designed to display weather data. | Weather
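
A client could resolve the type parameter to a layout with a simple lookup, as in the Python sketch below. The layout labels and the `select_layout` helper are hypothetical. The lookup ignores case because the sample payload on this page spells the type with a lowercase first letter while the table uses an uppercase one. That is a defensive choice, not documented behavior.

```python
# Hypothetical layout labels keyed by lowercased template type.
LAYOUTS = {
    "bodytemplate1": "text card",
    "bodytemplate2": "text card with one image",
    "listtemplate1": "list card",
    "weathertemplate": "weather card",
}


def select_layout(payload):
    """Return the layout label for a RenderTemplate payload.

    Raises ValueError when the type is missing or not one of the four
    static display card types.
    """
    card_type = str(payload.get("type") or "")
    layout = LAYOUTS.get(card_type.lower())
    if layout is None:
        raise ValueError(f"Unsupported display card type: {card_type!r}")
    return layout
```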

Dissecting a RenderTemplate directive

For example, consider a scenario where a user asks Alexa a question, such as, "Who is Usain Bolt?" Alexa then returns a RenderTemplate directive instructing the device to display visual metadata for the user. The directive payload supplies important information like mainTitle, subTitle, skillIcon, textField, and image, which map directly to specific components in the display card template.

The following example shows a sample payload:

{
  "directive": {
    "header": {
      "namespace": "TemplateRuntime",
      "name": "RenderTemplate",
      "messageId": "{{STRING}}",
      "dialogRequestId": "{{STRING}}"
    },
    "payload": {
      "token": "{{STRING}}",
      "type": "bodyTemplate2",
      "title": {
        "mainTitle": "Who is Usain Bolt?",
        "subTitle": "Wikipedia"
      },
      "skillIcon": null,
      "textField": "Usain St Leo Bolt, OJ, CD born 21 August 1986...",
      "image": {
        "contentDescription": "{{STRING}}",
        "sources": [
          {
            "url": "https://example.com/usain_bolt.jpg",  
            "size": "LARGE"
          }
        ]
      }
    }
  }
}  

The first payload parameter is the token, which correlates the Speak directive that includes Alexa TTS with the RenderTemplate directive that includes visual metadata. Next, the type parameter identifies which display card template to use, such as bodyTemplate2. The remaining parameters carry the visual metadata that you must bind to the template. Here's a sample of the visual metadata bound to the template specification for tablet-sized screens:

BodyTemplate2 with visual metadata mappings.
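
The binding step can be sketched in Python as follows. The `BodyTemplate2Card` class and `bind_body_template2` function are hypothetical names, and only the fields shown in the sample payload are read. Taking the first entry in sources is an assumption made for brevity. A real client would probably choose a source by size.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BodyTemplate2Card:
    """Hypothetical view model for the BodyTemplate2 card components."""
    token: str
    main_title: str
    sub_title: Optional[str]
    text: str
    image_url: Optional[str]
    image_description: Optional[str]


def bind_body_template2(payload):
    """Map a RenderTemplate payload onto the card's components."""
    title = payload.get("title") or {}
    image = payload.get("image") or {}
    sources = image.get("sources") or []
    return BodyTemplate2Card(
        token=payload["token"],
        main_title=title.get("mainTitle", ""),
        sub_title=title.get("subTitle"),
        text=payload.get("textField", ""),
        # Assumption: use the first listed source when one is present.
        image_url=sources[0]["url"] if sources else None,
        image_description=image.get("contentDescription"),
    )
```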

Rendering instructions

Your display card implementation must be aware of playback state, such as active playback, stopped, or paused. In addition to the guidance provided in the Interaction Model, your device must enforce the following rules when rendering visual metadata:

  1. Read the response on the request thread and parse the directives:
    • Run directives without a dialogRequestId on a new thread.
    • Run RenderTemplate directives on a new thread.
    • Place the directives with a dialogRequestId in your queue.
  2. Handle directives in the queue in a separate thread and address them sequentially.
  3. Sync Play directives with their associated RenderPlayerInfo directives. Use the sequence of Play directives to define the queue for playback. For example, if you send a PlaybackNearlyFinished event and then receive new Play and RenderPlayerInfo directives, add these directives to the queue. Handle this pair of directives after the current track completes.
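
Rules 1 and 2 above can be sketched with Python's standard threading and queue modules. The `DirectiveRouter` class is a hypothetical illustration, not part of any AVS SDK. It assumes each message has the directive/header shape shown in the sample payload, and it leaves out rule 3 (pairing Play with RenderPlayerInfo) and error handling.

```python
import queue
import threading


class DirectiveRouter:
    """Hypothetical router: immediate directives get their own thread,
    the rest are queued and handled one at a time by a single worker."""

    def __init__(self, handler):
        self._handler = handler
        self._queue = queue.Queue()
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def route(self, message):
        """Return the new thread for immediate directives, else None."""
        header = message["directive"]["header"]
        immediate = (
            "dialogRequestId" not in header
            or header["name"] == "RenderTemplate"
        )
        if immediate:
            thread = threading.Thread(
                target=self._handler, args=(message,), daemon=True
            )
            thread.start()
            return thread
        # Directives with a dialogRequestId wait their turn in the queue.
        self._queue.put(message)
        return None

    def _drain(self):
        # Single worker thread: queued directives run sequentially, in order.
        while True:
            message = self._queue.get()
            try:
                self._handler(message)
            finally:
                self._queue.task_done()

    def wait_idle(self):
        """Block until every queued directive has been handled."""
        self._queue.join()
```

A single worker thread is what keeps the queued directives sequential. In this sketch an exception in the handler would stop that worker, so a real client would want to catch and report errors there.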

UX considerations

Determine how best to render visual metadata for your users. The AVS UX Design Guidelines for Display Cards provide screen-specific guidance for TVs, tablets, and low-resolution screens. The guidelines include requirements for binding metadata and recommendations for display card transitions, interruption behaviors, and the presentation of Alexa states.

For more details, see AVS UX Design Overview for Display Cards.

Next steps