Best Practices for Designing Skills With Display Content

When you develop a skill for Alexa-enabled devices with a screen (including Fire TV Cube, Echo Show, Echo Spot, Fire HD 8, and Fire HD 10), keep the following best practices, design guidelines, and suggestions in mind.

 

Start Time for Skill

A skill with screen display should start within 2 seconds of launch. Slow-loading images, in particular, may slow down the skill.

If you want to load an external image as part of the skill, you can design the skill to start without that image to ensure you meet the time recommendation.

Alexa Skills With and Without Interface Support

When you develop a skill, you can choose whether to explicitly support a particular interface, such as Display for screen display. If a customer uses your skill on an Echo device with a screen, you want that experience to be as good as the experience on a device without a screen, such as Amazon Echo. Thus, even if the screen experience is not the focus of your skill, you should still consider it.

Even if you do not take any steps to support screen display, the cards you provide for the Alexa app are displayed on the screen device if that is what the customer is using. Thus, you should be aware of how screen display is supported. Note that cards designed for non-screen skills are rendered with the BodyTemplate1 template on supported Alexa-enabled devices with a screen.

If you want to take full advantage of the options provided by a screen, such as the ability to select a particular image from a list or play a video, then you must specifically support screen display in your code.

Design a Skill for Both Screen and Non-Screen Modes

As discussed in the next section, the skill service can determine which interfaces the device supports, and thus whether the customer's device has a screen display. The skill service cannot, however, distinguish between different screen devices. For the best customer experience, build a conditional workflow into the skill so that customers who use "headless" devices like Amazon Echo and customers who use an Alexa-enabled device with a screen each get an optimized experience. For Fire HD 8 and Fire HD 10, the customer can toggle between these modes.

If you want to enhance an existing skill by including visual and touch interactions, take the opportunity to rethink the workflow of the skill. In general, customers respond and act differently depending on whether they can see a screen while using the skill. Your skill service code should account for this difference and handle both types of interaction.

Determine the Supported Interfaces for the Current Device

The skill service should parse the request that comes from the device to determine the interfaces supported by the device. The keys of the event.context.System.device.supportedInterfaces object indicate the supported interfaces. In the following example, parsing this JSON-formatted sample request indicates that supportedInterfaces includes AudioPlayer, Display, and VideoApp. If any of these is not listed in supportedInterfaces, the device does not support that interface.

Your skill service code should handle both cases: when an interface such as Display is not supported (for example, on an Amazon Echo device, which cannot render a Display.RenderTemplate directive), and when it is supported (for example, on an Alexa-enabled device with a screen).

{
  "version": "1.0",
  "session": {
    "new": false,
    "sessionId": "amzn1.echo-api.session.<value>",
    "application": {
      "applicationId": "amzn1.ask.skill.<value>"
    },
    "attributes": {
      "previousPage": "the proposal"
    },
    "user": {
      "userId": "amzn1.ask.account.<value>"
    }
  },
  "context": {
    "System": {
      "application": {
        "applicationId": "<value>"
      },
      "user": {
        "userId": "amzn1.ask.account.<value>"
      },
      "device": {
        "supportedInterfaces": {
          "Display": {},
          "AudioPlayer": {},
          "VideoApp": {}
        }
      }
    }
  },
  "request": {
    "type": "IntentRequest",
    "requestId": "amzn1.echo-api.request.<value>",
    "timestamp": "2017-06-10T11:03:15Z",
    "locale": "en-US",
    "intent": {
      "name": "AMAZON.StopIntent"
    }
  }
}

In an actual request, the <value> placeholders in sessionId, applicationId, userId, and requestId are replaced with real values.
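The check described in this section can be sketched in Python as follows. This is a minimal illustration rather than an official SDK method: the function names, the "example-token" value, and the choice of BodyTemplate1 are assumptions made for this example, and the directive field names (token, backButton, textContent, primaryText) reflect the Display interface as commonly documented and should be verified against the current reference.

```python
from typing import Any, Dict


def supported_interfaces(request_envelope: Dict[str, Any]) -> Dict[str, Any]:
    """Return context.System.device.supportedInterfaces, or {} if it is absent."""
    return (
        request_envelope.get("context", {})
        .get("System", {})
        .get("device", {})
        .get("supportedInterfaces", {})
    )


def supports_display(request_envelope: Dict[str, Any]) -> bool:
    """True when the requesting device lists the Display interface."""
    return "Display" in supported_interfaces(request_envelope)


def build_response(request_envelope: Dict[str, Any], speech: str,
                   title: str, text: str) -> Dict[str, Any]:
    """Build a response that adds a display template only for screen devices."""
    response: Dict[str, Any] = {
        "outputSpeech": {"type": "PlainText", "text": speech},
        # The card is sent in both cases; it appears in the Alexa app.
        "card": {"type": "Simple", "title": title, "content": text},
        "shouldEndSession": True,
    }
    if supports_display(request_envelope):
        # Hypothetical template payload; only sent when Display is supported.
        response["directives"] = [{
            "type": "Display.RenderTemplate",
            "template": {
                "type": "BodyTemplate1",
                "token": "example-token",
                "backButton": "HIDDEN",
                "title": title,
                "textContent": {
                    "primaryText": {"type": "PlainText", "text": text}
                },
            },
        }]
    return {"version": "1.0", "response": response}
```

Passing the sample request above to build_response would include the Display.RenderTemplate directive, because Display appears in supportedInterfaces. A request from a device without a screen would get only the speech and the card.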

Design Guidelines for Display Template Usage

Ensure your design adheres to these guidelines so your skill will display properly on any Alexa-enabled device with a screen.

  • Avoid using line breaks to vertically align text, as these will not work properly across screens of different sizes.
  • Use font size overrides sparingly. Default font sizes have been set for all templates to allow for maximum legibility at the recommended distances.
  • Use markup such as bold, italic, and underline in meaningful ways to enhance the way your content displays on a device.
  • Action links should not be underlined and must be accessible by voice.
  • Use the new text alignment attributes to selectively align important text. Keep in mind that modifying the alignment will change the appearance on each device type. See Text Alignment With Rich Text.

Use of Images in Display Templates

Do not use background images to display foreground content, even if the results look good on Echo Show or Fire TV Cube. These background images will be cropped and scaled on Echo Spot, sometimes in unpredictable ways. Use background images as wallpaper, to add some delight to your templates, but not to convey significant information to the customer. If your image conveys significant information to the customer, ensure that you use a foreground image with BodyTemplate7.

Do not size images for one particular device, because they may not display correctly on other devices.

For background images, a 70% opacity black layer should be applied for optimal contrast between the image and text.
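To illustrate what a 70% opacity black layer does to a background image, the following sketch applies the standard "source over" compositing arithmetic to a single RGB pixel. Because the overlay is pure black, each channel keeps 30% of its original value. The function name and the integer rounding (floor) are assumptions made for this example; in practice you would normally apply the layer in an image editor or imaging library.

```python
from typing import Tuple


def apply_black_layer(rgb: Tuple[int, int, int],
                      opacity_percent: int = 70) -> Tuple[int, int, int]:
    """Composite a black layer of the given opacity over one RGB pixel.

    result = pixel * (100 - opacity) / 100, since black contributes 0.
    Integer floor division is used to keep the result deterministic.
    """
    if not 0 <= opacity_percent <= 100:
        raise ValueError("opacity_percent must be between 0 and 100")
    keep = 100 - opacity_percent
    r, g, b = rgb
    return (r * keep // 100, g * keep // 100, b * keep // 100)


# A bright pixel (200, 100, 50) becomes (60, 30, 15) under a 70% black layer.
darkened = apply_black_layer((200, 100, 50))
```

Darkening the image this much is what lets light-colored text remain legible over busy or bright backgrounds.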

Template Usage

Follow these guidelines to ensure your skill will display correctly on all supported Alexa-enabled devices with a screen.

  • Design once, using the templates as they are intended, so that the content will render correctly on any Alexa-enabled device with a screen.
  • Do not nest action links within list items, as these will be difficult to select by voice. This type of nesting also makes touch and selection unpredictable, because a customer cannot drill down within a list item on Echo Spot.
  • Use the header text and hint directives to deliver textual content as appropriate, instead of relying on background images that contain text.
  • Ensure the voice user interface (VUI) is consistent for all device experiences. Do not try to optimize the VUI for a specific device, or include text or spoken instructions such as “touch the screen”, as this may cause additional usability issues.
  • For full-screen images with foreground content, use the new BodyTemplate7. This template will only support images and should be used to load image content in the foreground, with a separate background image (if desired).
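As a rough illustration of the last two points, the sketch below builds a BodyTemplate7 directive with a foreground image, an optional wallpaper-style background image, and an optional hint. The function name, the token value, and the parameters are hypothetical. The field names (image, backgroundImage, contentDescription, sources) and the Hint directive shape follow the Display interface as commonly documented and should be checked against the current reference before use.

```python
from typing import Any, Dict, List, Optional


def _image(url: str, description: str) -> Dict[str, Any]:
    """Build an image object without device-specific sizing."""
    return {"contentDescription": description, "sources": [{"url": url}]}


def body_template7_directives(title: str,
                              image_url: str,
                              image_description: str,
                              background_url: Optional[str] = None,
                              hint_text: Optional[str] = None) -> List[Dict[str, Any]]:
    """Return directives that show a full-screen foreground image."""
    template: Dict[str, Any] = {
        "type": "BodyTemplate7",
        "token": "full-screen-image",  # hypothetical token
        "backButton": "VISIBLE",
        "title": title,
        # Significant content goes in the foreground image.
        "image": _image(image_url, image_description),
    }
    if background_url is not None:
        # Background is decorative wallpaper only.
        template["backgroundImage"] = _image(background_url, "Background")

    directives: List[Dict[str, Any]] = [
        {"type": "Display.RenderTemplate", "template": template}
    ]
    if hint_text is not None:
        # Text is delivered through the hint, not baked into an image.
        directives.append(
            {"type": "Hint", "hint": {"type": "PlainText", "text": hint_text}}
        )
    return directives
```

Keeping the text in the title and hint rather than in the image means the content survives cropping and scaling on smaller or round screens.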