Video Skills Kit (VSK) for Echo Show

Important: At this time, VSK integration for Echo Show is available to select partners only.

Echo Show devices let users interact with experiences by voice or by touching the screen. Video skills designed for Echo Show can combine video content with voice interaction, which makes them multimodal. Because Echo Show is an app-less device, you implement the Video Skills Kit (VSK) to deliver your video content and to let Echo Show users control videos by touch or by voice commands.

Echo Show Device
Video Skills Kit (VSK)
Comparison between Video Skills and Custom Skills on Screen Displays
Delivery of Video Content on Echo Show
Catalog Integration
Getting Started Building your Video Skill for Echo Show
High-level Workflow
Next Steps

Echo Show Device

Echo Show is an "always on" and app-less device, leveraging cloud-based skills and generic on-device components.

A common scenario for using video skills on Echo Show might be a user, cooking in the kitchen, who says "Alexa, play Bosch" to her device.

Sample scenario for using VSK for Echo Show

Amazon offers these Echo Show devices:

Echo Show (1st and 2nd Generation)
Echo Show 5
Echo Show 8
Echo Show 10

Echo Show (2nd Gen) launched in October 2018 and has a 10.1-inch HD screen. Echo Show 5 launched in June 2019 and has a more compact 5.5-inch screen. Echo Show 8 launched in November 2019 and has an 8.0-inch screen. Currently, Echo Spot does not support VSK integration.

Video Skills Kit (VSK)

The Video Skills Kit (VSK) is a set of APIs that allow users to interact with video content by using Alexa. VSK provides a way to develop video skills without having to design new mechanisms for enabling user interaction with video content. While traditional custom skill development requires designing wording and control flows, VSK provides a specialized set of tools to support video control mechanisms on Echo Show and Fire TV.

VSK exposes a variety of functionality, such as:

playback control
catalog searching
channel surfing

Comparison between Video Skills and Custom Skills on Screen Displays

The Video Skill API is intended for video providers or device manufacturers who want to make their devices voice interactive. The implementation involves handling the Alexa directives that result from utterances such as "Alexa, play Interstellar", by using AWS Lambda and your own video content.

In contrast, if you only need to support visuals in your Alexa skill (images, short video clips, plain text, and so on), create a custom skill rather than a video skill, using display templates and the Alexa Presentation Language (APL). For example, a quiz skill that shows related text or images on an Echo Show screen does not need the more involved interactive voice experience with video content that the Video Skill API provides. In such cases, see Create Skills for Alexa-Enabled Devices with a Screen.

Delivery of Video Content on Echo Show

Video skills use AWS Lambda and a web app player. The Lambda requirement is driven by the Alexa Skills architecture, whereas the web player requirement is driven by the app-less architecture of VSK on Echo Show. At a high level, here's how it works:

Customers authenticate with your video skill through Account Linking, for example, using OAuth.
VSK on Echo Show provides templates for displaying features such as Search and Browse. Devices render the templates with your video content.
Playback takes place using a custom web player that you own.

Catalog Integration

Catalog integration is the process of describing your media according to an XML schema, namely the Catalog Data Format (CDF), and regularly uploading your catalog into an S3 bucket, following the processes in catalog documentation.

Catalog integration is currently limited to long-form movies and episodic TV shows. Reach out to your Amazon Business contact for guidance on whether your content qualifies. You cannot fully implement the VSK for Echo Show unless you qualify for catalog integration. If you haven't integrated your catalog yet, perform the steps in catalog integration before designing your own video skill for Echo Show devices.

Note: The reference video skill contains a catalog that you can use temporarily for a quick start.

Getting Started Building your Video Skill for Echo Show

To install, build, and deploy the reference video skill targeting Echo Show devices, you need the following:

An Echo Show device for testing
Alexa Developer Account
AWS Developer Account
AWS Lambda - provided by the reference video skill
Web player optimized for the device - provided by the reference video skill
Account Linking to view your content
Catalog-integrated media and catalog name

Note: Reach out to your Amazon Business contact for guidance on your catalog's name.

Additionally, to create your own video skills, you also need a Logo image and a Background image.

For a quick start, we highly recommend that you first install, build, and deploy the Reference Video Skill by using the Automated Infrastructure CLI Tool. This tool speeds up the process of setting up a video skill from your computer's terminal or PowerShell. The reference video skill also provides you with a web player and access to a catalog so that you can test on a device.

You might also want to implement other backend services such as content metadata retrieval, category lookup, and several forms of search.

Important: These backend services are separate from the Lambda function. The Lambda function serves the Echo Show device and the web player.
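To keep the backend separate from the Lambda function, one option is to put the backend behind a small interface that the Lambda function calls. The sketch below is illustrative only: `Title`, `VideoBackend`, `InMemoryBackend`, and all of their fields and method names are hypothetical and are not part of VSK. A real backend would typically be a network service rather than an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol


@dataclass(frozen=True)
class Title:
    """Minimal metadata record; the fields are illustrative assumptions."""
    content_id: str
    name: str
    category: str


class VideoBackend(Protocol):
    """Hypothetical interface for the backend services a Lambda might call."""

    def get_metadata(self, content_id: str) -> Optional[Title]: ...

    def list_category(self, category: str) -> List[Title]: ...

    def search(self, query: str) -> List[Title]: ...


class InMemoryBackend:
    """Stand-in backend for local testing; not a production design."""

    def __init__(self, titles: List[Title]) -> None:
        self._titles = {t.content_id: t for t in titles}

    def get_metadata(self, content_id: str) -> Optional[Title]:
        # Content metadata retrieval by identifier.
        return self._titles.get(content_id)

    def list_category(self, category: str) -> List[Title]:
        # Category lookup.
        return [t for t in self._titles.values() if t.category == category]

    def search(self, query: str) -> List[Title]:
        # One simple form of search: case-insensitive substring match on the name.
        needle = query.casefold()
        return [t for t in self._titles.values() if needle in t.name.casefold()]
```

With this shape, the Lambda function only needs an object that satisfies `VideoBackend`, so the in-memory version can be swapped for a client of your real service without changing the directive-handling code.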

Also, note that even if you have previously created a video skill and a Lambda function for VSK on a Fire TV app, you must create a new, separate video skill and a new, separate Lambda function for VSK on Echo Show devices.

High-level Workflow

To integrate the VSK for Echo Show, you first create a video skill in the Alexa Developer Console and associate it with a Lambda function on AWS. When users interact with your skill through voice, the Alexa Voice Service (AVS) converts their commands into JSON objects, called directives.

Alexa sends these directives to your Lambda function. Your Lambda function handles the requests and usually interacts with your backend service to retrieve the needed information (by performing lookups, queries, and so on). This information might be the URI for the requested content, or the available titles that match the request. Your Lambda function then sends this information back to Alexa.
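The workflow above can be sketched as a small Python Lambda handler that reads the directive name, dispatches to a handler, and returns a response. This is a minimal sketch under stated assumptions, not the reference video skill: the directive names `ExamplePlay` and `ExampleSearch`, the payload fields `title` and `query`, the response shape, and the `_CATALOG` data are all placeholders. It also assumes the directive arrives as `event["directive"]` with a `header` and a `payload`. Consult the VSK API reference for the actual directive names and fields.

```python
import uuid

# Hypothetical backend data; a real skill would query its own backend service.
_CATALOG = {"interstellar": "https://example.com/player?id=interstellar"}


def handle_play(payload):
    """Look up the playback URI for the requested title."""
    title = payload.get("title", "").casefold()  # "title" is an assumed field
    uri = _CATALOG.get(title)
    return {"playbackUri": uri} if uri else {"error": "NOT_FOUND"}


def handle_search(payload):
    """Return the titles whose name contains the query text."""
    query = payload.get("query", "").casefold()  # "query" is an assumed field
    return {"titles": sorted(t for t in _CATALOG if query in t)}


# Placeholder directive names mapped to their handlers.
_HANDLERS = {"ExamplePlay": handle_play, "ExampleSearch": handle_search}


def lambda_handler(event, context):
    """Entry point: dispatch on the directive name and wrap the result."""
    directive = event["directive"]
    header = directive["header"]
    handler = _HANDLERS.get(header["name"])
    if handler is None:
        payload = {"error": "UNSUPPORTED_DIRECTIVE"}
    else:
        payload = handler(directive.get("payload", {}))
    # Assumed response envelope; the real one is defined by the API reference.
    return {
        "event": {
            "header": {
                "namespace": header.get("namespace"),
                "name": header["name"] + ".Response",
                "messageId": str(uuid.uuid4()),
            },
            "payload": payload,
        }
    }
```

Keeping the dispatch table separate from the handlers means that supporting another directive is a matter of adding one function and one dictionary entry.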

Next Steps

See Features of VSK for Echo Show.

Last updated: Mar 05, 2021