Create Video Skill and Lambda
Process Overview for Creating Video Skills for Multimodal Devices
Follow this step-by-step guide to enable an existing video skill to stream content to a multimodal device.
The benefits of the multimodal implementation over a traditional video skill include top-level utterance support for natural invocation, more precise content selection, and a better overall user experience on the multimodal device. For the full collection of video features, see the Introduction.
The traditional video skill architecture uses the Alexa.RemoteVideoPlayer API to send commands through the video provider's cloud service to a video client. For multimodal devices, the same API sends commands through a channel established by Alexa, which enables a new interaction model. The API continues to send search directives and play directives to the skill's AWS Lambda function. However, unlike the traditional architecture, in which the developer must forward directives from the Lambda to a separate device, video skills on multimodal devices send results directly back to Alexa, which drives the experience on the device.
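As an illustrative sketch (not the authoritative schema; consult the Video Skill API reference for exact field names and payload versions), a search directive arriving at the Lambda uses the standard Alexa directive envelope, with a header identifying the operation and a payload carrying the search entities:

```json
{
  "directive": {
    "header": {
      "namespace": "Alexa.RemoteVideoPlayer",
      "name": "SearchAndDisplayResults",
      "payloadVersion": "3",
      "messageId": "illustrative-message-id"
    },
    "payload": {
      "entities": [
        { "type": "Video", "value": "example title" }
      ]
    }
  }
}
```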
In simpler form, an end-to-end interaction for a video skill on a multimodal device includes the following sequence:
- The customer speaks to the multimodal device
- Structured directives arrive at the AWS Lambda function
- The Lambda function sends structured responses back to Alexa
- Alexa displays search results on screen (if needed)
- Alexa communicates with web player to render playback on screen (if needed)
The AWS Lambda function configured in your skill definition is the interface between Alexa and your backend services. To support streaming content to Alexa endpoints such as a multimodal device, you need to implement a separate set of APIs in your AWS Lambda function.
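A minimal sketch of that interface, assuming a Python Lambda runtime. The directive names, response envelope, and helper functions here are illustrative assumptions, not the authoritative Video Skill API schemas:

```python
# Minimal sketch of a video skill Lambda handler (Python runtime).
# Directive names and response shapes are illustrative; consult the
# Video Skill API reference for the authoritative schemas.

def lambda_handler(event, context):
    """Entry point Alexa invokes with a structured directive."""
    name = event["directive"]["header"]["name"]

    if name == "SearchAndDisplayResults":
        # Look up matching titles in the catalog and return them so
        # Alexa can render search results on the device screen.
        return build_response(name, {"searchResults": search_catalog(event)})
    if name == "SearchAndPlay":
        # Resolve a single playable item; Alexa then drives the web
        # player to render playback on screen.
        return build_response(name, {"playableItem": resolve_item(event)})

    # Unrecognized directive: return an empty payload (illustrative).
    return build_response(name, {})


def build_response(name, payload):
    # Echo a response envelope back to Alexa (shape is illustrative).
    return {"event": {"header": {"name": name + ".Response"}, "payload": payload}}


def search_catalog(event):
    # Placeholder for a call to a real catalog search service.
    return []


def resolve_item(event):
    # Placeholder for a call to a real content-resolution service.
    return None
```

The key structural point matches the flow above: the handler returns its result directly to Alexa rather than pushing directives to a separate device.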
The experience you intend to deliver determines the breadth and depth of required supporting services. Common backend services that accompany a video skill include content metadata retrieval, category lookup, and several forms of search.
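For illustration only, the three backend services named above might look like the following helpers. The in-memory catalog, field names, and matching logic are all assumptions; a production skill would query real metadata and search services:

```python
# Illustrative backend helpers a video skill Lambda might call.
# The catalog contents and field names are assumptions for this sketch.

CATALOG = [
    {"id": "tt001", "title": "Big Buck Bunny", "category": "animation"},
    {"id": "tt002", "title": "Sintel", "category": "animation"},
    {"id": "tt003", "title": "Cosmos", "category": "documentary"},
]

def get_metadata(item_id):
    """Content metadata retrieval: fetch one item's record by id."""
    return next((item for item in CATALOG if item["id"] == item_id), None)

def lookup_category(category):
    """Category lookup: list items belonging to a category."""
    return [item for item in CATALOG if item["category"] == category]

def search_titles(query):
    """Search: case-insensitive substring match on titles."""
    q = query.lower()
    return [item for item in CATALOG if q in item["title"].lower()]
```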
The web player opens in a web browser that supports the following codecs, formats, and standards:
- MP4 H.264
- Widevine DRM Level 1
- Encrypted Media Extensions (EME)
- Media Source Extensions (MSE)
- MP4 with AAC
- WebM with Vorbis
- WebM with Opus
Steps to Create a Video Skill Web Player
At a high level, you will complete the following steps to integrate your content with a video skill for a multimodal device:
- Register and configure a video skill
- Develop an AWS Lambda function
- Implement Video Skill API responses in AWS Lambda
- Develop a web player
- Implement event handlers in the web player
- Implement account linking (or use Login With Amazon)
- Enable the skill via the Alexa app, and perform an end-to-end test
The documentation breaks out these steps in the following topics: