Display Interface Reference

To create a skill that supports screen display, see Create Skills for Alexa-Enabled Devices with a Screen.

This reference describes how to use display templates in the skill service code to achieve the look and feel that you want for your skill. Remember that many users will be using Alexa-enabled devices without screen support, so your skill should always be designed in a voice-first manner.

Display templates for skills that support screen display

To include screen displays in their skill, a skill developer must use display templates in the skill service code. These templates are constructed so as to provide a great deal of flexibility for the skill developer.

Each template has a JSON representation, and can be included as appropriate in the skill responses sent to the screen.

The Alexa Skills Kit provides two categories of display templates, each with several specifically defined templates:

  • A body template displays text and images. These images cannot be made selectable.
  • A list template displays a scrollable list of items, each with associated text and optional images. These images can be made selectable, as described in this reference.

These templates differ from each other in the size, number, and positioning of the text and images, as well as list-scrolling behavior, but each template has a prescribed structure. These templates have been carefully constructed to provide a consistent user experience.

When you, as the skill developer, construct a response that includes a display template, you specify the template, text, and images, so you have latitude to provide the user experience you want.

Template interfaces (JSON) and designs

See Display Template Reference for template specifications.

For the JSON interface for each of these templates, the strings for the text or image fields may be empty or null. However, list templates must include at least one list item.

The skill icon you have selected for the skill in the Launch > Store Preview section of the developer console appears in the upper right corner of every template automatically, and is rescaled from the icon images provided in the developer console. You can change this skill icon as desired.

Each body template adheres to the following general interface:

  • Body Template Interface
{
  "type": "string",
  "token": "string"
}

Each list template adheres to the following general interface:

  • List Template Interface
{
  "type": "string",
  "token": "string",
  "listItems": [ ]
}

Form of the Display.RenderTemplate directive

The template attribute identifies the template to be used, as well as all of the corresponding data to be used when rendering it. Here is the form for a directives object that contains a Display.RenderTemplate directive. The type property has the value of the template name, such as BodyTemplate1 in this example. The other template properties will differ depending on the type value.
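
As a rough illustration of that form, the following Python sketch builds a directives list containing one Display.RenderTemplate directive. The helper name `render_template_directive`, the token, and the text values are placeholders chosen for this sketch; only the `type`, `token`, `title`, and `textContent` property names come from this reference.

```python
import json


def render_template_directive(template: dict) -> dict:
    """Wrap a template object in a Display.RenderTemplate directive."""
    return {"type": "Display.RenderTemplate", "template": template}


# BodyTemplate1 with a title and one text field; all values are placeholders.
body_template_1 = {
    "type": "BodyTemplate1",
    "token": "welcome_screen",
    "title": "Welcome",
    "textContent": {
        "primaryText": {"type": "PlainText", "text": "Welcome to My Skill"}
    },
}

directives = [render_template_directive(body_template_1)]
print(json.dumps({"directives": directives}, indent=2))
```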

See the formats for other templates in the Display Template Reference.

Display.RenderTemplate and other directives in response

For context, a response body that includes a Display and Hint directive is shown below. Other directives can also be included. A Hint directive requires that a display template also be included. Note that BodyTemplate3 and ListTemplate1 do not support the Hint directive.

An AudioPlayer directive and a VideoApp directive cannot be combined in the same response.

The following response includes multiple directives.
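
A minimal sketch of such a response, assembled in Python, might look like this. The `build_response` helper, the token, the image URL, and the text values are assumptions made for illustration; the directive shapes follow the Display.RenderTemplate and Hint forms described in this reference.

```python
import json


def build_response(speech: str, directives: list, end_session: bool = False) -> dict:
    """Assemble a response body with speech output and a list of directives."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "directives": directives,
            "shouldEndSession": end_session,
        },
    }


# BodyTemplate2 is used here because the reference lists it as supporting hints.
display_directive = {
    "type": "Display.RenderTemplate",
    "template": {
        "type": "BodyTemplate2",
        "token": "horoscope",
        "title": "This is your horoscope",
        "image": {
            "contentDescription": "Aquarius",
            "sources": [{"url": "https://example.com/images/aquarius.png"}],
        },
        "textContent": {
            "primaryText": {
                "type": "PlainText",
                "text": "You are going to have a good day today.",
            }
        },
    },
}

hint = {
    "type": "Hint",
    "hint": {"type": "PlainText", "text": "tell me my weekly horoscope"},
}

response_body = build_response(
    "You are going to have a good day today.", [display_directive, hint]
)
print(json.dumps(response_body, indent=2))
```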

Example: Directive for BodyTemplate2 to include in a response

On Echo Show and Fire TV Cube, the rendered body template shown in this example will display the title "My Favorite Car", with a back button at the upper left, the skill icon at the upper right, and an image at the right, with the image scaled, if needed, to the appropriate size for this template. The back button and background image are optional, but are included here.
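
The directive described above could be sketched in Python as follows. The token, URLs, descriptions, and body text are hypothetical, and the sketch assumes that backgroundImage takes the same image object format as image.

```python
import json

# Display.RenderTemplate directive for BodyTemplate2 (image on the right).
favorite_car_directive = {
    "type": "Display.RenderTemplate",
    "template": {
        "type": "BodyTemplate2",
        "token": "favorite_car",
        # Optional: show the back button at the upper left.
        "backButton": "VISIBLE",
        # Optional: background image (assumed to use the image object format).
        "backgroundImage": {
            "contentDescription": "Subtle gradient background",
            "sources": [{"url": "https://example.com/images/background.png"}],
        },
        "title": "My Favorite Car",
        "image": {
            "contentDescription": "My favorite car",
            "sources": [{"url": "https://example.com/images/car.png"}],
        },
        "textContent": {
            "primaryText": {
                "type": "PlainText",
                "text": "A short description of the car.",
            }
        },
    },
}

print(json.dumps({"directives": [favorite_car_directive]}, indent=2))
```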

Display template reference

See Display Template Reference.

GUI specifications for display templates

Follow these specifications to ensure that your skill works correctly and looks good on Alexa-enabled devices with a screen.

See Display and Behavior Specifications for Alexa-Enabled Devices With a Screen for device specifications.

Image size and format allowed by display templates

The images that are referenced in display templates should meet the following requirements.

  • These images may be in either JPEG or PNG formats, with the appropriate file extensions.
  • The templates provide support for either square or rectangular images. Note the aspect ratio for each template shown in the table.
  • When including the image, it is recommended that you provide several different URL sizes. The image size selected will be the smallest possible size that matches the desired aspect ratio and gives a clear image for the size of the screen where the image is being rendered. See the display sizes in the following table. If you do not provide an image of the appropriate size, the next larger image size will be scaled down for use to fit the intended slot. Because scaling down may cause poor image quality, it is strongly recommended that you provide appropriately sized images.
  • For speedier responses and to manage latency issues, refer to the following table for guidance on image size. Images are compressed when sent, and decompressed when received.
  • You must host the images at HTTPS URLs that are publicly accessible.
  • For the best visual experience, ensure images are transparent. (Only images in PNG format can be transparent.)
  • By the judicious use of a transparent background, your images can appear to take on a wide range of shapes and sizes.
  • Background images with slight patterns or gradients are recommended to provide a consistent, high-quality appearance.
  • For background images, a 70% opacity black layer should be applied for optimal contrast between the image and text.

As shown in this table, the cumulative file size of all images in the skill should not exceed 3 MB. In general, keeping image sizes small reduces latency and provides a better customer experience.

Number of images in the skill    Individual image file size
10                               ≤ 300 KB
6                                ≤ 500 KB
2                                ≤ 1.5 MB
1                                ≤ 3 MB
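
The rows of this table all divide the same 3 MB budget evenly among the images. A small Python sketch of that arithmetic follows; it assumes decimal units (1 MB = 1,000,000 bytes), which is what makes 10 images come out at 300 KB each, and the function names are illustrative only.

```python
MAX_TOTAL_BYTES = 3 * 1000 * 1000  # 3 MB cumulative budget, decimal units assumed


def per_image_limit_bytes(image_count: int) -> int:
    """Return the per-image size that keeps image_count images within budget."""
    if image_count < 1:
        raise ValueError("image_count must be at least 1")
    return MAX_TOTAL_BYTES // image_count


def within_image_budget(image_sizes_bytes: list) -> bool:
    """Check that the cumulative size of all images does not exceed 3 MB."""
    return sum(image_sizes_bytes) <= MAX_TOTAL_BYTES


print(per_image_limit_bytes(10))  # per-image limit for a skill with 10 images
print(within_image_budget([300_000] * 10))
```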

Display image sizes for each template for Alexa-enabled devices with a screen

While the ASK runtime code will manage your skill execution so that you do not need to develop uniquely for a device, it is helpful to understand how your images will look on different-sized screens. The following table lists image sizes supported by each template on Echo Show and Fire TV Cube. Echo Spot images are scaled down as appropriate. With the backgroundImage field, the display size is the same as for the screen pixel dimensions for full-size images. Do not use smaller images which need to be scaled up, as they will have a poor appearance on a larger device.

For the image referenced in the image field, the sizes are as follows for each template.

Display sizes are in pixels and are listed for Echo Show and Fire TV Cube as maximum height x width.

ListTemplate1 (vertical text)
  • Echo Show: 88 x 88
  • Fire TV Cube: 110 x 110

ListTemplate2 (horizontal with text under image)
  • Echo Show: Height should be 280 pixels. Depending on the aspect ratio desired, the width should be between 192 and 498 pixels. The following aspect ratios are supported (width x height): Portrait (192 x 280), Square (280 x 280), 4:3 (372 x 280), 16:9 (498 x 280).
  • Fire TV Cube: Height should be 404 pixels. Depending on the aspect ratio desired, the width should be between 284 and 504 pixels. The following aspect ratios are supported (width x height): Portrait (320 x 474px), Square (404 x 404px), 4:3 (380 x 284px), 16:9 (504 x 284px).

BodyTemplate1 (full-width text)
  • Echo Show: inline images only
  • Fire TV Cube: inline images only

BodyTemplate2 (image right)
  • Echo Show: 340 x 340
  • Fire TV Cube: 576 x 576

BodyTemplate3 (image left)
  • Echo Show: 340 x 340
  • Fire TV Cube: 576 x 576

BodyTemplate6 (full-screen image with text overlay)
  • Echo Show: 340 x 340
  • Fire TV Cube: 1920 x 1080

BodyTemplate7 (full-screen image with full-screen image in background)
  • Echo Show: 880 x 346 (main image), 1024 x 600 (full-screen image in background)
  • Fire TV Cube: 1712 x 788 (main image), 1920 x 1080 (full-screen image in background)

Image object specifications

The image object in the display templates takes the following format. Note that the image format used for images on cards is different.

The contentDescription property is text used to describe the image for a screen reader. The fields size, widthPixels, and heightPixels are optional. By default, size takes the value X_SMALL. If the other size values are included, then the order of precedence for displaying images begins with X_LARGE and proceeds downward, which means that larger images will be downscaled for display on smaller screens. For the best user experience, include the appropriately sized image, and do not include larger images.

Do not include the widthPixels and heightPixels integer values, which are optional, unless they are exactly correct.

{
  "image": {
    "contentDescription": "string",
    "sources": [
      {
        "url": "string",
        "size": "string",
        "widthPixels": integer,
        "heightPixels": integer
      },
      {
        "url": "string",
        "size": "string",
        "widthPixels": integer,
        "heightPixels": integer
      },
      {...}
    ]
  }
}

The values for size are listed in the following table.

Property    Description                                Recommended size in pixels (width x height)
X_SMALL     Displayed within extra small containers    480 x 320
SMALL       Displayed within small containers          720 x 480
MEDIUM      Displayed within medium containers         960 x 640
LARGE       Displayed within large containers          1200 x 800
X_LARGE     Displayed within extra large containers    1920 x 1280
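
The image object rules above can be sketched as a pair of Python helpers. The function names and the example URLs are hypothetical; the sketch only adds widthPixels and heightPixels when both are supplied, in line with the advice to omit them unless they are exactly correct.

```python
from typing import Optional

VALID_SIZES = ("X_SMALL", "SMALL", "MEDIUM", "LARGE", "X_LARGE")


def image_source(
    url: str,
    size: Optional[str] = None,
    width_pixels: Optional[int] = None,
    height_pixels: Optional[int] = None,
) -> dict:
    """Build one entry of the image object's sources array."""
    if not url.startswith("https://"):
        raise ValueError("images must be hosted at HTTPS URLs")
    if size is not None and size not in VALID_SIZES:
        raise ValueError(f"unknown size: {size}")
    source = {"url": url}
    if size is not None:
        source["size"] = size
    # Only include pixel dimensions when the caller knows both exactly.
    if width_pixels is not None and height_pixels is not None:
        source["widthPixels"] = width_pixels
        source["heightPixels"] = height_pixels
    return source


def image_object(content_description: str, sources: list) -> dict:
    """Build an image object with a screen-reader description."""
    return {"contentDescription": content_description, "sources": list(sources)}


car_image = image_object(
    "My favorite car",
    [
        image_source("https://example.com/car-small.png", "SMALL", 720, 480),
        image_source("https://example.com/car-large.png", "LARGE"),
    ],
)
print(car_image)
```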

Back button in templates

Echo Show and Fire TV Cube support back buttons on all templates, although the developer can choose to hide the back button. On Echo Spot, the back button does not appear at all, but the customer can achieve the same effect with a long swipe from the left edge of the screen, if the backButton object is set to "VISIBLE" as described in this section.

For some skills with a visual component, a back button allows the customer greater freedom to navigate through the skill. In others, such as quiz games, the inclusion of a back button might cause incorrect or undesired behavior, for example if the customer uses the back button to return to a previously answered question. The skill developer can decide whether to include a back button, which appears at the upper left, on each display template used in the skill. The backButton field can be used with each display template.

The backButton object can have the attribute "HIDDEN" or "VISIBLE". If not included in a template response, then by default the back button will be shown on the screen.

If the customer states "go back", that has the same effect as invoking the AMAZON.Previous intent in your skill.

These two examples show how the backButton object appears in a response that includes a display template.
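
As a hedged sketch of both cases, the following Python helper sets the backButton value on a template before it is wrapped in a directive. The helper name and the quiz template contents are made up for illustration.

```python
def with_back_button(template: dict, visible: bool) -> dict:
    """Return a copy of the template with backButton set to VISIBLE or HIDDEN."""
    updated = dict(template)
    updated["backButton"] = "VISIBLE" if visible else "HIDDEN"
    return updated


quiz_question = {
    "type": "BodyTemplate1",
    "token": "question_3",
    "title": "Question 3",
    "textContent": {
        "primaryText": {"type": "PlainText", "text": "Which planet is largest?"}
    },
}

# Hide the back button so the customer cannot return to an answered question.
hidden = {
    "type": "Display.RenderTemplate",
    "template": with_back_button(quiz_question, visible=False),
}

# The same template with the back button explicitly visible.
visible = {
    "type": "Display.RenderTemplate",
    "template": with_back_button(quiz_question, visible=True),
}

print(hidden["template"]["backButton"], visible["template"]["backButton"])
```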

Example: Back button hidden by display template

Example: Back button made visible by display template

Include hint directives in responses

To use a hint directive in a response, you must also include a display template, other than BodyTemplate3 and ListTemplate1, which do not support hints, in your response.

Hints should be used for optional content and to delight customers, and not for important information. If the Hint directive is used in a response, the hint is visible on Echo Show and Fire TV Cube, but not visible on Echo Spot. Thus, every skill that uses hints should be designed so that the hints are optional. A hint can be included on each template, by use of the Hint directive, except for BodyTemplate3 and ListTemplate1.

The Hint directive allows a string value that informs the user what to ask Alexa. When displayed on screen, the hint text appears in the following set form:

*Try "<wake-word>, <hint_String>"*

Thus, if the value is "tell me what movies are playing", and the customer has their wake word set to "Alexa", the hint appears as follows:

*Try "Alexa, tell me what movies are playing"*

For brevity, the following example shows only the Hint directive, but a typical response with a hint would also include a Display.RenderTemplate directive.

{
  "directives": [
    {
      "type": "Hint",
      "hint": {
        "type": "PlainText",
        "text": "string"
      }
    }
  ]
}
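
To make the relationship between the hint string and the displayed text concrete, here is a small Python sketch. The function names are illustrative, and `hint_preview` only mimics the on-screen form described above; the device, not the skill, produces the real text.

```python
def hint_directive(text: str) -> dict:
    """Build a Hint directive carrying a plain-text hint string."""
    return {"type": "Hint", "hint": {"type": "PlainText", "text": text}}


def hint_preview(directive: dict, wake_word: str = "Alexa") -> str:
    """Approximate how the hint would read on screen for a given wake word."""
    return f'Try "{wake_word}, {directive["hint"]["text"]}"'


movies_hint = hint_directive("tell me what movies are playing")
print(hint_preview(movies_hint))
```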

textContent object specifications

The textContent object, found in all templates, allows for primaryText, secondaryText, and tertiaryText fields, which may be styled differently. With the ListTemplate1 template, the text is automatically styled to match these hierarchy levels. For the other templates, the text listed for each of primaryText, secondaryText, and tertiaryText is concatenated, with line breaks added between each, and no difference in font between the lines. Each of primaryText, secondaryText, and tertiaryText has the same format, and each is subject to an 8000-character limit.

{
  "textContent": {
    "primaryText": TextField,
    "secondaryText": TextField,
    "tertiaryText": TextField
  }
}

In each case, TextField is represented as follows. If type is set to PlainText, no markup is included. If type is set to RichText, the markup described in Supported Markup can be included.

{
  "type": "PlainText" | "RichText",
  "text": "string"
}

PlainText and RichText are the only supported type values.

If type is set to RichText, you can use the supported markup and supported XML characters to change the appearance of the text. See Supported Markup for Text in Display Templates.

In addition, if type is set to RichText, an inline image with an absolute file path can be used in the value of "text". The height of the image has no specific restriction. The maximum width is 880px, which accounts for left and right padding.

In the "text" field, you can include text that is wrapped in an action tag, which is then selectable on the screen.

In this example, the word "Cancel" is wrapped in an action tag that gives it the value 'cancel_trip'. When the customer touches the word "Cancel" on the screen, this triggers a Display.ElementSelected event with a token value of 'cancel_trip'. The skill developer can then use this token to map this touch interaction to the appropriate behavior in the skill service code.

<action value='cancel_trip'>Cancel</action>

Example: PlainText instance

  {
    "type": "PlainText",
    "text": "Welcome to My Skill"
  }

Example: RichText instance

{
    "type": "RichText",
    "text": "Welcome to <b>My Skill</b>"
}
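
The TextField and textContent structures can be built with a couple of small helpers. This Python sketch is an assumption about how one might organize that code; the function names are not part of the interface, and the 8000-character check reflects the per-field limit stated earlier in this section.

```python
from typing import Optional

MAX_TEXT_FIELD_CHARS = 8000  # per-field limit stated in this reference


def text_field(text: str, rich: bool = False) -> dict:
    """Build a TextField of type PlainText or RichText."""
    if len(text) > MAX_TEXT_FIELD_CHARS:
        raise ValueError("text exceeds the 8000-character limit")
    return {"type": "RichText" if rich else "PlainText", "text": text}


def text_content(
    primary: dict,
    secondary: Optional[dict] = None,
    tertiary: Optional[dict] = None,
) -> dict:
    """Build a textContent object, leaving out fields that were not supplied."""
    content = {"primaryText": primary}
    if secondary is not None:
        content["secondaryText"] = secondary
    if tertiary is not None:
        content["tertiaryText"] = tertiary
    return content


welcome = text_content(
    text_field("Welcome to <b>My Skill</b>", rich=True),
    text_field("Say help to hear your options."),
)
print(welcome)
```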

Character count maximums for display templates

Each Alexa-enabled device with a screen allows limited text on the screen, depending on the template used, and the font size used. If the included text exceeds these limits, the text is truncated on the screen display. The user cannot scroll to see the remaining text. Ensure that the text you use in the templates does not exceed these limits.

Markup is not included in the maximum character limits. These character limits are based on a font size of 32px, and must be adjusted proportionately if another font size is used. The default font size is 32px.

For each template, the maximum for the title is 200 characters.

Template                                            Main text field
ListTemplate1 (vertical text)                       84 total
ListTemplate2 (horizontal with text under image)    84 total
BodyTemplate1 (full-width text)                     85 total
BodyTemplate2                                       8000 total
BodyTemplate3                                       8000 total
BodyTemplate6                                       85 total
BodyTemplate7                                       No main text field
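
Because markup does not count toward these limits, a skill can check text length after stripping tags. The following Python sketch does this with a simple regular expression and `html.unescape`; it is an approximation that assumes well-formed tags, and the limits apply to the default 32px font size only.

```python
import html
import re

# Main text field limits from the table above (default 32px font size).
MAIN_TEXT_LIMITS = {
    "ListTemplate1": 84,
    "ListTemplate2": 84,
    "BodyTemplate1": 85,
    "BodyTemplate2": 8000,
    "BodyTemplate3": 8000,
    "BodyTemplate6": 85,
}

_TAG = re.compile(r"<[^>]+>")


def visible_length(rich_text: str) -> int:
    """Count characters after removing markup tags and decoding entities."""
    return len(html.unescape(_TAG.sub("", rich_text)))


def fits_main_text(template_type: str, rich_text: str) -> bool:
    """Check the visible text against the limit for the given template."""
    return visible_length(rich_text) <= MAIN_TEXT_LIMITS[template_type]


print(visible_length("Welcome to <b>My Skill</b>"))
print(fits_main_text("BodyTemplate1", "Welcome to <b>My Skill</b>"))
```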

Font ramp mapping among Alexa-enabled devices with a screen

Font sizes for Alexa-enabled devices with a screen will scale automatically based on the updated font ramp below. Each template has a default font size set for the primary content to maximize legibility. Those default sizes can be overridden using the values below.

Size 3 is the default. Although Fire TV Cube has a larger screen than the other devices listed, Fire TV Cube also has a higher display size per pixel, so the Fire TV Cube font size values are less than for Echo Show.

Display font sizes for Alexa-enabled devices with a screen

Font size    Fire TV Cube    Echo Show    Echo Spot
Size 7       48              68           48
Size 5       32              48           38
Size 3       24              32           32
Size 2       16              28           28

Supported markup for text in display templates

The following markup elements, as well as XML special characters and Unicode characters, are supported for rich text, but not plain text. The format is always UTF-8. For encoding certain special characters, see Handle XML Special Characters.

Each entry lists the element, an example of the markup, and the resulting output.

Line break: <br/>
  • Example markup: First line<br/>Second line
  • Output: "First line" and "Second line" appear on separate lines.

Bold: <b>
  • Example markup: This is a <b>ladybird</b> beetle
  • Output: This is a ladybird beetle (with "ladybird" in bold)

Italics: <i>
  • Example markup: Scientific name <i>Coccienellidae</i>
  • Output: Scientific name Coccienellidae (with "Coccienellidae" in italics)

Underline: <u>
  • Example markup: Always <u>feed</u> your ladybird tasty aphids.
  • Output: Always feed your ladybird tasty aphids. (with "feed" underlined)

Font sizes: <font size="2"> small (28px) </font>, <font size="3"> medium </font>, <font size="7"> large (68px) </font>
  • Example markup: <font size="7">Cake</font><br/><font size="3">This is the best cake recipe ever.</font><br/><font size="2">- Flour</font><br/><font size="2">- Sugar</font><br/>
  • Output: Four lines reading "Cake" (large), "This is the best cake recipe ever." (medium), "- Flour" (small), and "- Sugar" (small).

Action: <action token="VALUE">clickable text </action>
  • Description: The clickable text can be tapped on the screen.
  • Example markup: Learn the <action token="2347"> history </action> of ladybirds.
  • Output: Learn the history of ladybirds. (with "history" selectable)

Inline images: <img src='URL' width='WIDTH' height='HEIGHT' alt='TEXT' />
  • Description: Image loaded from the Internet. Supports auto-sizing the image to fit line height, and alignment to baseline or text bottom.
  • Example markup: This is an inline <img src='https://www.example.com/test1.jpg' width='500' height='500' alt='test image' /> image.
  • Output: Rich text with the inline image.

Handle XML special characters

Templates and cards differ in how they display special characters. If you want to use the following characters in the content for a display template, escape them as follows:

  • ampersand (&) is escaped to &amp;
  • double quotes (") are escaped to &quot; or \"
  • single quotes (') are escaped to &apos; or \'
  • less than (<) is escaped to &lt;
  • greater than (>) is escaped to &gt;
  • backslash (\) is escaped to \\
  • non-breaking space is escaped in XML format as &#160; (do not use HTML format)
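
A minimal Python sketch of the XML escapes in this list follows. It is meant for literal content only, not for markup tags you want rendered, and the function name is an assumption. The backslash and the \" form are JSON-level escapes, which `json.dumps` applies when the response is serialized.

```python
import json

# Ampersand must be replaced first so later entities are not double-escaped.
_ESCAPES = (
    ("&", "&amp;"),
    ('"', "&quot;"),
    ("'", "&apos;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ("\u00a0", "&#160;"),  # non-breaking space in XML numeric form
)


def escape_display_text(text: str) -> str:
    """Escape XML special characters in literal display template content."""
    for raw, escaped in _ESCAPES:
        text = text.replace(raw, escaped)
    return text


print(escape_display_text('Tom & "Jerry" <3'))
# Backslashes are doubled by JSON serialization, not by the XML escaping step.
print(json.dumps({"text": "C:\\recipes"}))
```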

Text alignment with rich text

Refer to textContent Object Specifications for general information on how to implement rich text.

Wherever the textContent object is used in a template, center alignment can now be done as follows, if type is set to RichText.

{
    "type" : "RichText",
    "text" : "<div align='center'>This text will align center</div>"
}

Example: Use BodyTemplate3 in a response to create a screen display

The template BodyTemplate3 is a simple body template consisting of fields that you can specify: text, title, token, and image. In this example, the optional backButton and backgroundImage are not included. The skill icon always appears on the screen at the top right when a display template is rendered onscreen. The skill icon is specified separately when you prepare the skill for launch in the Launch > Store Preview section of the developer console.

To create this display:

  • Include a Display.RenderTemplate directive in your JSON response
  • Set the type to BodyTemplate3
  • Set the title field and textContent object to indicate the text to display
  • Set the image object with URL properties and a contentDescription (for use by screen readers).
  • Set the backButton attribute if you want to toggle between hidden and visible back buttons. Otherwise, by default, the back button is shown on the screen.

Example: Use ListTemplate1 in a response to create a screen display

List templates contain a scrollable list of items, which can be presented vertically with text only or horizontally with accompanying images on the screen. With ListTemplate1, you get a vertical list of items. This ListTemplate1 template has a single title field (displayed at the top of the screen), a backgroundImage, and a listItems field. Each list item contains optional token, textContent, and image fields. To create this display:

  • Include a Display.RenderTemplate directive in your JSON response.
  • Set the template type to ListTemplate1.
  • Set the title string to the desired value.
  • Set the backgroundImage string to the desired URL.
  • Define each list item.
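
The steps above can be sketched in Python as follows. The helper names, tokens, and URL are hypothetical, and the sketch assumes backgroundImage takes an image object with a sources array. It also enforces the rule that a list template must include at least one list item.

```python
import json
from typing import Optional


def list_item(token: str, primary: str, secondary: Optional[str] = None) -> dict:
    """Build one list item with a token and plain-text content."""
    content = {"primaryText": {"type": "PlainText", "text": primary}}
    if secondary is not None:
        content["secondaryText"] = {"type": "PlainText", "text": secondary}
    return {"token": token, "textContent": content}


def list_template_1(token: str, title: str, background_url: str, items: list) -> dict:
    """Build a ListTemplate1 object (vertical list)."""
    if not items:
        raise ValueError("list templates must include at least one list item")
    return {
        "type": "ListTemplate1",
        "token": token,
        "title": title,
        "backgroundImage": {"sources": [{"url": background_url}]},
        "listItems": list(items),
    }


recipes = list_template_1(
    "recipe_list",
    "Cake Recipes",
    "https://example.com/images/background.png",
    [
        list_item("recipe_chocolate", "Chocolate cake", "45 minutes"),
        list_item("recipe_carrot", "Carrot cake", "60 minutes"),
    ],
)
directive = {"type": "Display.RenderTemplate", "template": recipes}
print(json.dumps(directive, indent=2))
```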

Example: Use ListTemplate2 in a response to create a screen display

ListTemplate2 produces a horizontal list. This ListTemplate2 template has a single title field (displayed at the top of the screen) and a listItems field. Each element in the list field contains optional token, textContent, and image fields. To create this display:

  • Include a Display.RenderTemplate directive in your JSON response.
  • Set the template type to ListTemplate2.
  • Set the title string to the desired value.
  • Set backButton values if appropriate.
  • Include the syntax for each list item, as shown.

In this example, the token list_template_two has no effect on the display, but you as the skill developer can use the token for tracking purposes in the skill service code to make the item selectable.

Handle selection events by voice and touch

Each item in a list can be made selectable by touch. For each selectable element on the screen, the skill developer provides an associated token that they will receive in the callback response when the element is selected. See the list templates in the Display Template Reference. The developer may name the tokens using their preferred methodology.

The skill can set a token field on any selectable element, and this token is returned in a Display.ElementSelected request if that element is selected. An example of such an event is shown below.

 "request": {
    "type": "Display.ElementSelected",
    "requestId": "amzn1.echo-api.request.7zzzzzzzzz",
    "timestamp": "2018-06-06T20:05:04Z",
    "locale": "en-US",
    "token": "getTopicName-Cookie-Contest"
  }
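
A skill service might read the token from such a request and route it to the matching behavior. The Python sketch below assumes the request arrives as a parsed dictionary with a top-level "request" key; the handler table and its behavior are invented for illustration.

```python
from typing import Optional


def selected_token(request_envelope: dict) -> Optional[str]:
    """Return the token of a Display.ElementSelected request, or None."""
    request = request_envelope.get("request", {})
    if request.get("type") != "Display.ElementSelected":
        return None
    return request.get("token")


# Hypothetical mapping from tokens to skill behavior.
HANDLERS = {
    "getTopicName-Cookie-Contest": lambda: "Opening the cookie contest topic.",
}


def handle_selection(request_envelope: dict) -> str:
    """Dispatch a touch selection to the handler registered for its token."""
    token = selected_token(request_envelope)
    handler = HANDLERS.get(token)
    if handler is None:
        return "Sorry, I can't open that item."
    return handler()


envelope = {
    "request": {
        "type": "Display.ElementSelected",
        "requestId": "amzn1.echo-api.request.7zzzzzzzzz",
        "timestamp": "2018-06-06T20:05:04Z",
        "locale": "en-US",
        "token": "getTopicName-Cookie-Contest",
    }
}
print(handle_selection(envelope))
```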

There is no built-in intent for selecting actions or list items. However, you can create an intent for this purpose and include it in the intent schema. This intent should trigger the same behavior as when the skill receives a Display.ElementSelected request.

Design this intent so that if a list template is used, items shown on screen can be selected by the user saying the item name, or by saying the number of the item. The skill developer determines in the service (AWS Lambda or web service) whether a user can select by name or by ordinal.

The skill developer must create an intent, which is forwarded to the skill, to enable customers to vocally select list items and actions. The skill service should define intents for "select", "open", and "show" as well as for "number one", "number two", "one", "two", and so forth.

Each list item is tracked by use of a token in order to facilitate the correct response when a list item is selected by touch.
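
One way to let voice and touch share the same logic is to resolve a spoken ordinal or item name to the list item's token, then handle that token exactly as a touch selection would be handled. This Python sketch assumes the skill keeps the listItems it last sent (for example in session attributes) and that your custom intent has already extracted the ordinal or name; none of these names come from the interface itself.

```python
from typing import Optional


def resolve_selection(
    list_items: list,
    ordinal: Optional[int] = None,
    name: Optional[str] = None,
) -> Optional[str]:
    """Map a spoken ordinal ("number two") or item name to a list item token."""
    if ordinal is not None and 1 <= ordinal <= len(list_items):
        return list_items[ordinal - 1]["token"]
    if name:
        wanted = name.strip().lower()
        for item in list_items:
            shown = item["textContent"]["primaryText"]["text"]
            if shown.strip().lower() == wanted:
                return item["token"]
    return None


items_on_screen = [
    {
        "token": "recipe_chocolate",
        "textContent": {"primaryText": {"type": "PlainText", "text": "Chocolate cake"}},
    },
    {
        "token": "recipe_carrot",
        "textContent": {"primaryText": {"type": "PlainText", "text": "Carrot cake"}},
    },
]

print(resolve_selection(items_on_screen, ordinal=2))
print(resolve_selection(items_on_screen, name="chocolate cake"))
```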

As the user progresses through a skill, different body and list templates may be used in the course of delivering the skill content. For example, with a recipe skill, the user may navigate from a search screen to selecting a recipe to viewing the ingredients to preparing the recipe, each requiring a separate screen display. The developer must plan this flow carefully.

The current screen display with a specified template remains on screen for a skill if the session is not ended, and no new template has been sent. If a response with a card is sent, the screen display with a template will remain in place. Thus, a skill could progress through multiple turns with the same screen in place, unless the display is purposefully changed with a skill response that includes a different template.

Template and card precedence order for display on screen

For Alexa-enabled devices with screens, the response is parsed for display options. If there are multiple display options, the order of precedence for display is as follows:

  1. Display.RenderTemplate directive. The last-rendered template remains on the screen until another template is sent, or until the skill exits. Thus, the same template will remain on screen for multiple turns unless another template is explicitly sent.

  2. Card. If no template has been sent to the screen, but a card has been sent, this card is displayed on the screen. Cards are rendered on Alexa-enabled devices with a screen using BodyTemplate1.

  3. The default template (BodyTemplate1) is automatically created and displayed if there is no template or card specified in the skill response, and none is currently displayed on screen.
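
The precedence order can be summarized as a small decision function. This Python sketch is one reading of the rules above, not the device's actual logic: it takes the "response" object of a skill response plus a flag saying whether a template is already on screen, and returns a label describing what would be displayed.

```python
def screen_content(response: dict, template_on_screen: bool = False) -> str:
    """Describe what the screen shows for a response, following the precedence order."""
    directives = response.get("directives") or []
    # 1. A new Display.RenderTemplate directive always wins.
    if any(d.get("type") == "Display.RenderTemplate" for d in directives):
        return "new template"
    # A previously rendered template stays in place, even if a card is sent.
    if template_on_screen:
        return "current template"
    # 2. With no template at all, a card is shown (rendered with BodyTemplate1).
    if response.get("card"):
        return "card"
    # 3. Otherwise the default BodyTemplate1 is created automatically.
    return "default template"


card_only = {"card": {"type": "Simple", "title": "Horoscope", "content": "Good day."}}
print(screen_content(card_only))
print(screen_content(card_only, template_on_screen=True))
```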

Determine the version of the supported display

To ensure compatibility, the version of the markup and templates that are supported by the current device are sent in the device request. The only version that is currently supported is "1", but providing support for these attributes in your response helps ensure backwards compatibility.

{
  "display": {
    "templateVersion": "1",
    "markupVersion": "1"
  }
}

Attribute          Description
markupVersion      Version of markup.
templateVersion    The version of templates supported by the requesting device.
token              The token for the content currently shown on the display.
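
A skill can use these attributes to decide whether to send a display template at all. The Python sketch below assumes the display versions arrive under context.System.device.supportedInterfaces.Display in the request, a location this section does not show, so treat the path as an assumption and adjust it to the requests your skill actually receives.

```python
from typing import Optional


def display_versions(request_envelope: dict) -> Optional[dict]:
    """Return the device's display version info, or None for voice-only devices."""
    device = request_envelope.get("context", {}).get("System", {}).get("device", {})
    # Assumed location of the display attributes; not confirmed by this section.
    return device.get("supportedInterfaces", {}).get("Display")


screen_request = {
    "context": {
        "System": {
            "device": {
                "supportedInterfaces": {
                    "Display": {"templateVersion": "1", "markupVersion": "1"}
                }
            }
        }
    }
}
voice_only_request = {"context": {"System": {"device": {"supportedInterfaces": {}}}}}

print(display_versions(screen_request))
print(display_versions(voice_only_request))
```

Returning None for devices without a screen keeps the voice-first path as the default and makes the display template an optional addition.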

Format of different GUI responses for devices with and without screens

The simple response shown below supports only a card and speech response, and is what a developer would provide if not specifically supporting screen display. Note that if viewed on an Alexa-enabled device with a screen, this card will be rendered using BodyTemplate1.

{
  "version": "1.0",
  "sessionAttributes": {
    "supportedHoroscopePeriods": {
      "daily": true,
      "weekly": false,
      "monthly": false
    }
  },
  "response": {
    "outputSpeech": {
      "type": "PlainText",
      "text": "You are going to have a good day today."
    },
    "card": {
      "type": "Simple",
      "title": "Horoscope",
      "content": "You are going to have a good day today."
    },
    "reprompt": {
      "outputSpeech": {
        "type": "PlainText",
        "text": "Anything else?"
      }
    }
  }
}

The following response supports the same card and speech response, but also supports screen display via a display template. BodyTemplate1 is used as the display template, and the text on screen includes a phrase with bolded emphasis.

{
  "version": "1.0",
  "sessionAttributes": {
    "supportedHoroscopePeriods": {
      "daily": true,
      "weekly": false,
      "monthly": false
    }
  },
  "response": {
    "card": null,
    "outputSpeech": {
      "type": "PlainText",
      "text": "You are going to have a good day today."
    },
    "reprompt": {
      "outputSpeech": {
        "type": "PlainText",
        "text": "Anything else?"
      }
    },
    "directives": [
      {
        "type": "Display.RenderTemplate",
        "template": {
          "type": "BodyTemplate1",
          "token": "horoscope",
          "title": "This is your horoscope",
          "image": {
            "contentDescription": "Aquarius",
            "sources": [
              {
                "url": "https://example.com/resources/card-images/aquarius-symbol.png"
              }
            ]
          },
          "textContent": {
            "primaryText": {
              "type": "RichText",
              "text": "You are going to have a <b>good day</b> today."
            }
          }
        }
      }
    ],
    "shouldEndSession": false
  }
}

Validation rules for responses

Responses for devices with Alexa, including Alexa-enabled devices with screens, should follow these validation rules.

  • At most one Display.RenderTemplate directive can be specified in a response.
  • Do not include both an AudioPlayer.Play directive with long-form audio and a VideoApp.Launch directive together in the same response.
  • All required fields must be provided.
  • No unknown properties may be specified.

If the response is invalid, an error card appears on Alexa-enabled devices with a screen, and an error is sent to the skill.
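
Some of these rules can be checked in the skill service before a response is sent. The following Python sketch covers only the directive-level rules stated in this reference (one Display.RenderTemplate at most, no AudioPlayer.Play together with VideoApp.Launch, and a Hint only alongside a display template); the function name and messages are illustrative, and required-field and unknown-property checks are left out.

```python
def validate_directives(directives: list) -> list:
    """Return a list of rule violations found in a response's directives."""
    errors = []
    types = [d.get("type") for d in directives]
    if types.count("Display.RenderTemplate") > 1:
        errors.append("at most one Display.RenderTemplate directive is allowed")
    if "AudioPlayer.Play" in types and "VideoApp.Launch" in types:
        errors.append("AudioPlayer.Play and VideoApp.Launch cannot be combined")
    if "Hint" in types and "Display.RenderTemplate" not in types:
        errors.append("a Hint directive requires a display template")
    return errors


good = [
    {"type": "Display.RenderTemplate", "template": {"type": "BodyTemplate2"}},
    {"type": "Hint", "hint": {"type": "PlainText", "text": "show me more"}},
]
bad = [
    {"type": "AudioPlayer.Play"},
    {"type": "VideoApp.Launch"},
    {"type": "Hint", "hint": {"type": "PlainText", "text": "show me more"}},
]

print(validate_directives(good))
print(validate_directives(bad))
```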

Use of shouldEndSession attribute and session timeouts

When shouldEndSession is not specified, or has a value of null, and a Display.RenderTemplate directive is active, the session is kept open, and the screen device does not expect a voice response. If the user says the wake word and a command, this utterance is recognized in the context of the skill.

To specifically control shouldEndSession behavior, set this attribute to true to end the session, or to false to continue the session.

If shouldEndSession is set to false and there is no reprompt, the skill will exit the session if there is no activity for 30 seconds.

If there is a reprompt, then the timeout for reprompt is 8 seconds with the microphone open and a blue ring displayed, plus 8 seconds for the customer to respond. If Alexa does not hear a response, then a reprompt is rendered, and the customer is given another 8 seconds to respond. If there is still no response, and shouldEndSession is set to false, the session remains open until the display times out.

Best practices for skill development for Alexa-enabled devices with a screen

See Best Practices for Designing Skills for Alexa-Enabled Devices With a Screen.

Test your skill with Alexa Simulator

On the Test page in the developer console, you can use the Alexa Simulator to test what templates look like when rendered on Echo Show or Echo Spot, even if you do not have a screen device. Fire TV Cube is not supported for skill simulation.

Ensure that you have all the display options selected to see how the skill looks with these devices.

As noted in the Display Template Reference, different devices render the same templates in different ways. For example, with some templates, foreground images in Fire TV Cube and Echo Show become background images in Echo Spot. Thus, for a good customer experience, ensure that you test as thoroughly as possible.

Alexa Simulator is subject to the following restrictions:

  • Only custom skills are currently supported

  • The display is a close representation of what you see on the screen, but not pixel-perfect

  • The skill is not functional in the simulator, and nothing in the simulator is clickable. The session is not maintained.

  • Alexa Simulator is not supported for use in multiple browser tabs.

Alexa Simulator Includes Screen Support

See: Test Your Skill
