Understand Video Skill Integration for Echo Show

Follow this step-by-step guide to enable an existing video skill to stream content to Echo Show. To learn the basics of building video skills before building one for Echo Show, see Understand the Video Skill API.

The benefits of the Echo Show implementation over a traditional video skill implementation include top-level utterance support for natural invocation, more precise content selection, and a better overall user experience on Echo Show. For the full collection of video features, see the Echo Show Video Skill Feature Guide.

Architecture overview

The traditional video skill architecture uses the Alexa.RemoteVideoPlayer API to send commands through the video provider’s cloud service to a video client. On Echo Show, the API sends commands through a channel established by Alexa, which allows for a new interaction model using the same API. The API continues to send search directives and play directives to the AWS Lambda function for the skill. However, unlike the traditional video skill architecture where directives must be sent from the Lambda to a separate device by the developer, results for video skills on Echo Show are sent back directly to Alexa to drive the experience on the device.

Echo Show API Architecture

Primary components

AWS Lambda

The AWS Lambda function configured in your skill definition is the interface between Alexa and your backend services. To support streaming content to Alexa endpoints such as Echo Show, you need to implement a separate set of APIs in your AWS Lambda function.

Backend services

The experience you intend to deliver determines the breadth and depth of required supporting services. Common backend services that accompany a video skill include content metadata retrieval, category lookup, and several forms of search.

Web player

When a user plays content from your service, a URL that you provide in your skill definition opens your Echo Show-optimized web player. The web player receives voice commands through a JavaScript library provided by Alexa and included in the HTML for your web player. AWS Lambda and the JavaScript library pass playback lifecycle events and the metadata of the current content to Alexa to provide users with the full set of video features.

The web player opens in a web browser that supports the following codecs, formats, and standards:

Video:

  • HLS/MPEG-DASH

  • MP4 H.264

  • Widevine DRM

  • Encrypted Media Extensions (EME)

  • Media Source Extensions (MSE)

Audio:

  • MP4 with AAC

  • WebM with Vorbis

  • WebM with Opus

Steps to Create a Video Skill Web Player

Step 1: Build your web player

Complete the following high-level tasks to build a video skill web player.

  1. Migrate or build your web player

    If you have existing web assets that deliver your content, isolate the web player component and supporting functionality such as metrics reporting, advertisement logic, and other dependencies. Video must play exclusively in full-screen mode, and links to other websites are restricted.

  2. Style your web player

    To apply the visual controls and non-video elements of your player, see Echo Show Video Skill Certification Guidelines for certification guidelines related to the video experience.

  3. Include the Alexa Video JavaScript library

    The alexa-web-player-controller JavaScript library provides a communication bridge between Alexa and your web player. Load the JavaScript library with the following script tag in your HTML:

    <script type="text/javascript" src="https://dmx0zb087qvjm.cloudfront.net/alexa-web-player-controller.0.1.min.js"></script>
    
  4. Initialize the Alexa Video JavaScript library

    In your web app’s initialization code, wait for the Alexa object to be ready, then initialize the alexa-web-player-controller library with your readyCallback and errorCallback.

    The readyCallback is invoked when the library is ready for execution. It receives a controller object used to communicate with the library, which exposes all of the control methods. The errorCallback is invoked if an error causes the web player to close. The user receives an error message containing a human-readable reason for the error, and the alexa-web-player-controller library closes the web container after sending the error message.

    During initialization, you can also optionally register event handlers by passing an additional argument containing a map of callback functions keyed by event names. Otherwise, register handlers with the controller.on(handlers) or controller.on(event, handler) method described in the next section. The handlers for play, pause, and resume are required.

    // Load Alexa Video JavaScript library before this script.
    AlexaWebPlayerController.initialize(readyCallback, errorCallback);
    

    or

    AlexaWebPlayerController.initialize(readyCallback, errorCallback, handlers);
    
  5. Register event handlers

    Register handlers using the controller.on(handlers)/controller.on(event, handler) method. The handlers play, pause, and resume are required.

    function readyCallback(controller) {
        var Event = AlexaWebPlayerController.Event;
        var handlers = {
            [Event.LOAD_CONTENT]: function handleLoad(params) {},
            [Event.PAUSE]: function handlePause() {},
            [Event.RESUME]: function handleResume() {},
            [Event.SET_SEEK_POSITION]: function handleSetPos(positionInMilliseconds) {},
            [Event.ADJUST_SEEK_POSITION]: function handleAdjustPos(offsetInMilliseconds) {},
            [Event.NEXT]: function handleNext() {},
            [Event.PREVIOUS]: function handlePrevious() {},
            [Event.CLOSED_CAPTIONS_STATE_CHANGE]: function handleCCState(state) {},
            [Event.PREPARE_FOR_CLOSE]: function handlePrepareForClose() {},
            [Event.ACCESS_TOKEN_CHANGE]: function handleAccessToken(accessToken) {}
        };
        controller.on(handlers);
    }
    

    Handlers

    LOAD_CONTENT: Load a given piece of content.
        params.contentUri (string): URI of the content, provided in the Video Skill API response.
        params.accessToken (string): User credentials.
        params.offsetInMilliseconds (integer): Offset from the beginning of the content at which to begin playback.
        params.autoPlay (boolean): Flag to automatically play the content after loading.
    PAUSE: Pause playback. No parameters.
    RESUME: Resume playback. No parameters.
    SET_SEEK_POSITION: Change playback position to an absolute position within the content.
        positionInMilliseconds (integer): If negative, go to the beginning; if past the end of the content, go to the end.
    ADJUST_SEEK_POSITION: Change playback position by an offset from the current position.
        offsetInMilliseconds (integer): If positive, seek forward from the current position; if negative, seek backward.
    NEXT: Advance to the next video, if available. No parameters.
    PREVIOUS: Go back to the previous video, if available. No parameters.
    CLOSED_CAPTIONS_STATE_CHANGE: Update the closed captions state.
        state (object): Closed captions state, including enabled, text, background, and window background.
    PREPARE_FOR_CLOSE: Prepare for the device to close the web container in 250 ms, and handle any remaining actions. No parameters.
    ACCESS_TOKEN_CHANGE: Update the access token.
        accessToken (string): User credentials.
  6. Implement handlers

    How you handle commands depends on the implementation of your player. All handlers must return a Promise object, resolved on successful handling of the command or rejected with an error object containing errorType and message if there is a failure. If a failure occurs for LOAD_CONTENT, PAUSE, or RESUME, the device calls PREPARE_FOR_CLOSE and tries to close the web player.

    For the LOAD_CONTENT operation, your handler receives parameters including the contentUri sent in your AWS Lambda function response, representing the content to be played. If your service requires the user to be authenticated to stream content, an accessToken is included as well. Additional details may also be included, such as offsetInMilliseconds if the user is picking up in the middle of content, or autoPlay if the content should play automatically after loading.
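    As a sketch, a LOAD_CONTENT handler might look like the following. The player object and its method names (load, setCurrentTime, play) are assumptions about your own media player, not part of the Alexa library.

```javascript
// Hypothetical sketch of a LOAD_CONTENT handler that fulfills the Promise
// contract. `player` stands in for your own media player; its method names
// are assumptions, not part of the alexa-web-player-controller library.
function makeLoadContentHandler(player) {
    return function handleLoad(params) {
        return new Promise(function (resolve, reject) {
            try {
                // Load the content URI from the Lambda response; pass the
                // access token along if your service requires authentication.
                player.load(params.contentUri, params.accessToken);
                if (params.offsetInMilliseconds) {
                    player.setCurrentTime(params.offsetInMilliseconds);
                }
                if (params.autoPlay) {
                    player.play();
                }
                resolve();
            } catch (err) {
                // Reject with the error object shape the library expects.
                reject({ errorType: 'PLAYER_ERROR', message: String(err) });
            }
        });
    };
}
```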

  7. Hide loading overlay

    Your player is not visible until you call controller.showLoadingOverlay with a value of false. Make the call after your assets have loaded and the experience is ready to show; you must call this method to disable the overlay once the content is loaded and the UI is presentable. The loading overlay is always shown during initialization. For subsequent content loads (such as loads of new, unrelated content), you can turn the overlay on and off as appropriate, but this is not required. If you want a custom loading screen, you can disable the overlay before content is ready and provide your own visualizations. Do not call this method while content is playing.

    controller.showLoadingOverlay(false);
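    As a sketch, a subsequent content load might wrap the overlay calls like this; a player.load method that returns a Promise is an assumption about your own player, not the library.

```javascript
// Hypothetical sketch: keep the loading overlay up while new content loads,
// then hide it once the player is presentable. `player.load` returning a
// Promise is an assumption about your own player implementation.
function loadWithOverlay(controller, player, contentUri) {
    controller.showLoadingOverlay(true);       // optional for subsequent loads
    return player.load(contentUri).then(function () {
        controller.showLoadingOverlay(false);  // player is now visible
    });
}
```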
    
  8. Send playback lifecycle events to Alexa

    When your player changes state, use the controller.setPlayerState(playerState) method to pass the lifecycle events along to Alexa. The playerState includes two properties, state and positionInMilliseconds. Update the playerState whenever either the state or the positionInMilliseconds changes.

    controller.setPlayerState({
        state: AlexaWebPlayerController.State.IDLE,
        positionInMilliseconds: 0
    });
    

    The player states are as follows.

    IDLE: Player is idle; no content is loaded or playing. Player is ready to stream content.
    BUFFERING: Playback is suspended while content buffers.
    PLAYING: Player is actively streaming content.
    PAUSED: Playback is paused within the content.
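    As a sketch, these states can be reported by forwarding HTML5 video element events; wiring to a <video> element is an assumption about a player built on that tag.

```javascript
// Hypothetical sketch: forward HTML5 <video> element events to Alexa as
// playback lifecycle updates. `controller` is the object received in the
// readyCallback; `State` holds the AlexaWebPlayerController.State values.
function wirePlayerState(videoElement, controller, State) {
    function report(state) {
        controller.setPlayerState({
            state: state,
            positionInMilliseconds: Math.round(videoElement.currentTime * 1000)
        });
    }
    videoElement.addEventListener('playing', function () { report(State.PLAYING); });
    videoElement.addEventListener('pause',   function () { report(State.PAUSED); });
    videoElement.addEventListener('waiting', function () { report(State.BUFFERING); });
    videoElement.addEventListener('ended',   function () { report(State.IDLE); });
}
```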
  9. Configure allowed operations

    When the allowed operations for Alexa change, set them by calling controller.setAllowedOperations(allowedOperations) through the JavaScript library. Allowed operations require handlers, which are not pre-implemented. By default, an operation is not allowed until its handler has been implemented and the operation is set to true in allowedOperations.

    controller.setAllowedOperations({
        adjustRelativeSeekPositionForward: true,
        adjustRelativeSeekPositionBackwards: true,
        setAbsoluteSeekPositionForward: true,
        setAbsoluteSeekPositionBackwards: true,
        next: true,
        previous: true,
    });
    

    The allowed operations are as follows.

    allowedOperations (object): Allowed operations for the content currently in the player.
        adjustRelativeSeekPositionForward (boolean; requires the adjustSeekPosition handler): If true, allow the user to seek forward relative to the current position.
        adjustRelativeSeekPositionBackwards (boolean; requires the adjustSeekPosition handler): If true, allow the user to seek backwards relative to the current position.
        setAbsoluteSeekPositionForward (boolean; requires the setSeekPosition handler): If true, allow the user to seek forward to an absolute position.
        setAbsoluteSeekPositionBackwards (boolean; requires the setSeekPosition handler): If true, allow the user to seek backwards to an absolute position.
        next (boolean; requires the next handler): If true, allow the user to request the next content in the play queue.
        previous (boolean; requires the previous handler): If true, allow the user to request the previous content in the play queue.
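    As a sketch, you might recompute these flags from your play queue whenever it changes; the queue and index arguments are assumptions about your own playlist model.

```javascript
// Hypothetical sketch: derive allowed operations from the play queue position
// and push them through the library. `queue` and `index` are assumptions
// about your own playlist model, not part of the Alexa library.
function updateAllowedOperations(controller, queue, index) {
    controller.setAllowedOperations({
        adjustRelativeSeekPositionForward: true,
        adjustRelativeSeekPositionBackwards: true,
        setAbsoluteSeekPositionForward: true,
        setAbsoluteSeekPositionBackwards: true,
        next: index < queue.length - 1,  // only when more content follows
        previous: index > 0              // only when something precedes
    });
}
```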
  10. Send content metadata to Alexa
    Pass along metadata for the current content in the controller.setMetadata(metadata) method. Do this for both the initial content and any new content played thereafter.

    controller.setMetadata({
        type: AlexaWebPlayerController.ContentType.TV_SERIES_EPISODE,
        value: {
            name: "name",
            closedCaptions: {
                available: true
            },
            durationInMilliseconds: 1000,
            series: {
                name: "name",
                seasonNumber: 1
            },
            episode: {
                number: 1,
                name: "name"
            }
        }
    });
    

    Metadata for other video:

    controller.setMetadata({
        type: AlexaWebPlayerController.ContentType.VIDEO,
        value: {
            name: "",
            closedCaptions: {
                 available: true
            },
            durationInMilliseconds: 1000,
        }
    });
    
    The metadata fields are as follows.

    type (String, required): Content type of the metadata; one of the AlexaWebPlayerController.ContentType values.
    value (Object, required): Value of the metadata. Each type may have a different set of values.
        name (String, required): Name of the video. Example: Interstellar
        closedCaptions (Object, required): Closed captions of the video.
            available (Boolean, required): Availability of the closed captions. Values: true/false
        durationInMilliseconds (Number, optional): Duration of the video in milliseconds. Example: 3141343
        series (Object, optional): Metadata of a series.
            name (String, optional): Name of the series. Example: "Survivor: Borneo"
            seasonNumber (String, optional): Number of the season. Example: 1
        episode (Object, optional): Metadata of an episode.
            name (String, optional): Name of the episode. Example: The Marooning
            number (String, optional): Number of the episode. Example: 1

    The AlexaWebPlayerController.ContentType values are:

    TV_SERIES_EPISODE ("TV_SERIES_EPISODE"): Content type for a TV series episode.
    VIDEO ("VIDEO"): Content type for other video.
  11. Get/Set the state of closed captions

    Retrieve the device-level setting for closed captions at the beginning of playback by using the controller.getClosedCaptionsState() method. To toggle closed captions on and off, use the controller.setClosedCaptionsStateEnabled(enabled) method.

    controller.getClosedCaptionsState();
    
    controller.setClosedCaptionsStateEnabled(isEnabled: boolean);
    

    Closed captions state:

    {
        enabled: <boolean>,
        text: {
            size: <number>,
            color: <string>,
            opacity: <number>,
            font: <string>,
            edge: <number>
        },
        background: {
            color: <string>,
            opacity: <number>
        },
        windowBackground: {
            color: <string>,
            opacity: <number>
        }
    }
    
    The closed captions state fields are as follows.

    enabled (Boolean): Whether the closed captions are enabled. Values: true/false
    text (Object): Text preference.
        size (Number): Size of the text, in pixels. Example: 10
        color (String): Color of the text, as an RGB value in hex. Example: #ff0000
        opacity (Number): Color opacity of the text, as an alpha value between 0 and 1.0. Example: 1.0
        font (String): Font of the text, from Google Fonts. Values:
            Default (selected by the caption author)
            Casual ("ComingSoon")
            Cursive ("DancingScript-Regular")
            Monospace Sans ("DroidSansMono")
            Monospace Serif ("CutiveMono")
            Sans Serif ("Roboto-Regular")
            Serif ("NotoSerif-Regular")
            Small Capitals ("CarroisGothicSC-Regular")
        edge (Number): Edge style of the text. Values:
            0: None (no character edges)
            1: Uniform (uniformly outlined character edges)
            2: Drop Shadowed (drop-shadowed character edges)
            3: Raised (raised bevel character edges)
            4: Depressed (depressed bevel character edges)
    background (Object): Text background preference.
        color (String): Color of the text background, as an RGB value in hex. Example: #ff0000
        opacity (Number): Opacity of the text background, as an alpha value between 0 and 1.0; disabled when the text background color is set to default. Example: 1.0
    windowBackground (Object): Window background preferences.
        color (String): Color of the closed captions window background, as an RGB value in hex. Example: #ff0000
        opacity (Number): Opacity of the closed captions window background, as an alpha value between 0 and 1.0; disabled when the window background color is set to default. Example: 1.0
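    As a sketch, the closed captions state can be translated into styles for your own caption overlay; the returned property names are assumptions about your renderer, not part of the Alexa library.

```javascript
// Hypothetical sketch: translate the closed captions state object into
// CSS-like styles for a caption overlay. The returned property names are
// assumptions about your renderer; the state shape follows the fields above.
function captionStyleFrom(ccState) {
    if (!ccState.enabled) {
        return { display: 'none' };
    }
    return {
        display: 'block',
        fontSize: ccState.text.size + 'px',          // size is in pixels
        color: ccState.text.color,                   // RGB hex, e.g. #ff0000
        opacity: String(ccState.text.opacity),       // alpha between 0 and 1.0
        backgroundColor: ccState.background.color
    };
}
```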

  12. Report fatal errors

    If your player encounters an error and is unable to play content, use the controller.sendError(error) method to send a fatal error to Alexa. Alexa hides the web app, calls the PREPARE_FOR_CLOSE handler, and closes the web player. You do not need to report non-fatal errors to Alexa.

    controller.sendError({
        type: AlexaWebPlayerController.ErrorType.PLAYER_ERROR,
        message: 'Error message as string'
    });
    

    Error types

    PLAYER_ERROR: Send a PLAYER_ERROR when there is an unrecoverable error in the media player.
    CLIENT_ERROR: Send a CLIENT_ERROR for any client-side error not related to the player.
    SERVER_ERROR: Send a SERVER_ERROR when an error occurs server-side, including failed requests, inability to buffer content, unreachable assets, and connectivity issues.
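    As a sketch, you might classify failures before reporting them; the isNetworkError flag is an assumption about how your player surfaces errors.

```javascript
// Hypothetical sketch: classify a failure and report it with the matching
// error type. The `isNetworkError` flag is an assumption about how your
// player surfaces failures; ErrorType holds AlexaWebPlayerController.ErrorType.
function reportFatalError(controller, ErrorType, err) {
    var type = err.isNetworkError ? ErrorType.SERVER_ERROR : ErrorType.PLAYER_ERROR;
    controller.sendError({ type: type, message: err.message });
}
```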
  13. End the experience

    When the playback session has ended, call the controller.close() method to inform Alexa. Alexa then hides the web app, calls the PREPARE_FOR_CLOSE handler, and ends the experience.

    controller.close();
    
  14. Host the website on a publicly accessible URL

    Finally, make your player available on a public URL. Be sure your web player is served over HTTPS for compatibility and security.

    At this point, your web player is almost ready to be tested on an Echo Show. But first, you must implement the required Video Skill APIs in your skill’s Lambda function as described in the following section.

Step 2: Implement Video Skill APIs in the AWS Lambda

Several new APIs have been added to the set of video skill interfaces to drive the Echo Show experience. To get started quickly, try the code in the accompanying Alexa Video Skills for Echo Show - Sample Lambda.js file in your AWS Lambda. The following steps summarize the APIs available to your AWS Lambda. A more comprehensive reference is available in Echo Show Video Skill API Reference.
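As a sketch, the Lambda entry point can route incoming directives by header name to per-API builder functions; the builder names here are placeholders you would implement against your backend, and the envelope shape follows the request and response examples in the steps below.

```javascript
// Hypothetical sketch: route incoming video skill directives by header name.
// The per-API builder functions are placeholders for your own backend logic;
// the directive envelope shape follows the examples in this guide.
function routeDirective(event, handlers) {
    var name = event.directive.header.name;
    var handler = handlers[name];
    if (!handler) {
        throw new Error('Unsupported directive: ' + name);
    }
    return handler(event.directive.payload);
}

// In your Lambda, you would wire this up as the entry point, for example:
// exports.handler = function (event, context, callback) {
//     try {
//         callback(null, routeDirective(event, {
//             GetDisplayableItems: searchCatalog,            // placeholder
//             GetDisplayableItemsMetadata: lookUpMetadata,   // placeholder
//             GetPlayableItems: resolvePlayable,             // placeholder
//             GetPlayableItemsMetadata: playbackMetadata     // placeholder
//         }));
//     } catch (err) {
//         callback(err);
//     }
// };
```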

  1. Discovery

    Through the discovery process, your skill is able to declare its capabilities to Alexa. These capabilities may be static for all users of your skill or change based on the user’s subscription level with your service. Either way, your response to this API call establishes what a user can do with your skill. To learn more about how to post Discovery for video skills on Echo Show, please reach out to your Amazon Alexa contact.

  2. Search

    Similar to the existing Alexa.RemoteVideoPlayer.SearchAndDisplayResults API, the new Alexa.VideoContentProvider.GetDisplayableItems API has an almost identical request but expects a collection of items to be returned in the response. This response should contain identifiers for your content. A separate call, detailed in the following section, is made to retrieve metadata. This request is separated into two API calls to optimize latency.

    Alexa.VideoContentProvider.GetDisplayableItems
    The following example is a search request for a query such as "Alexa, show me <Title of Video Content>."

    {
        "directive": {
            "header": {
                ...
                "name": "GetDisplayableItems",
                "namespace": "Alexa.VideoContentProvider",
                ...
            },
            "endpoint": {
                "scope": {
                    "type": "BearerToken",
                    "token": "accessToken",
    
                }
            },
            "payload": {
                "entities": [{
                    "type": "Video",
                    "value": "Title of Video Content",
                    "externalIds": {
                        "yourCatalogKey": "yourContentIdFromCatalog"
                    }
                }],
                "locale": "en-US",
                "minResultLimit": 8,
                "maxResultLimit": 25,
                "timeWindow": {
                    "end": "2016-09-07T23:59:00+00:00",
                    "start": "2016-09-01T00:00:00+00:00"
                }           
            }
        }
    }
    

    The following example is a search response.

    {
        "event": {
            "header": {
                ...
                "name": "GetDisplayableItemsResponse",
                "namespace": "Alexa.VideoContentProvider",
                ...
            },
            "payload": {
                "nextToken": "wwrfwef3",
                "mediaItems": [
                    {
                        "mediaIdentifier": {
                            "id": "video://content.1234567"
                        }
                    },
                    {
                        "mediaIdentifier": {
                            "id": "video://content.4567890"
                        }
                    }
                ]
            }
        }
    }
    

    Alexa.VideoContentProvider.GetDisplayableItemsMetadata

    The following example is a request to resolve metadata for the results returned in the previous example.

    {
        "directive": {
            "header": {
                ...
                "name": "GetDisplayableItemsMetadata",
                "namespace": "Alexa.VideoContentProvider",
                ...
            },
            "endpoint": {
                "scope": {
                    "type": "BearerToken",
                    "token": "accessToken",
    
                }
            },
            "payload": {
                "mediaIdentifiers": [
                    {
                        "id": "video://content.1234567"
                    },
                    {
                        "id": "video://content.4567890"
                    }
                ]
            }
        }
    }
    

    The following example is a response providing metadata required for search results.

    {
        "event": {
            "header": {
                ...
                "name": "GetDisplayableItemsMetadataResponse",
                "namespace": "Alexa.VideoContentProvider",
                ...
            },
            "payload": {
                "resultsTitle": "Search Results",
                "searchResults": [
                    {
                        "title": "Video Name",
                        "contentType": "ON_DEMAND",
                        "thumbnailImage": {
                            "small": "https://.../image.jpg",
                            "medium": "https: //.../image.jpg",
                            "large": "https: //.../image.jpg"
                        },
                        /* ... rest of video 1 metadata ... */
                    },  
                    {
                        /* ... video 2 metadata ... */
                    }
                ]
            }
        }
    }
    
  3. Provider Landing Page

    When a user opens the provider landing page, Alexa sends two GetDisplayableItems API calls. The first retrieves categories for the landing page by setting the itemType property to "CATEGORY" and including a sortType of "RECOMMENDED". The second obtains the featured video for the landing page by setting the itemType to "VIDEO" instead. Alexa then sends one GetDisplayableItemsMetadata call with a combined list of the category IDs and the video ID. The response includes metadata about the video as well as the categories. For the required fields for each item type, see Echo Show Video Skill API Reference.

    The following example is metadata for a CATEGORY item type.

    {
        "name": "My Watchlist",
        "contentType": "ON_DEMAND",
        "itemType": "CATEGORY",
        "selectionAction": "BROWSE",
        "thumbnailImage": {
            "contentDescription": "My Watchlist Image",
            "sources": [{
                "url": "http://ecx.images-amazon.com/AJhF52zkD7ObETpyTTW.jpg",
                "size": "SMALL",
                "widthPixels": 720,
                "heightPixels": 480
            }]
        },
        "mediaIdentifier": {
            "id": "entity://provider/category/myWatchList"
        }
    }
    
  4. Play

    Play scenarios use the GetPlayableItems and GetPlayableItemsMetadata APIs that are almost identical to what is described previously. The only difference is that the metadata returned for playback varies slightly from that which is used for search. For more information, see Echo Show Video Skill API Reference.

  5. Channel Navigation

    Channel Navigation scenarios also use the GetPlayableItems and GetPlayableItemsMetadata APIs, which are almost identical to what is described previously. The only difference is that the metadata returned for channel navigation varies slightly from that which is used for search/play. For more information, see Echo Show Video Skill API Reference.

Step 3: Implement Account Linking (if applicable)

Account linking allows your users to connect their Alexa identity to their account within your service. If your service requires a user to authenticate in order to access content, then account linking provides this connection. Once a user enables your skill and completes the account linking process, subsequent directives sent to your AWS Lambda include the access_token for the user so you may perform additional authentication, authorization, and personalization.

For a full description of account linking, including implementation details, see Link an Alexa User with a User in Your System.

Step 4: Update Skill Definition

Finally, you must update your skill definition to enable streaming on Echo Show through your web player. If you are already working with someone at Amazon for your integration, you should reach out to them at this time to perform this skill update. Such updates are not yet available directly on the developer portal. To be added to the waitlist to try out video skills for Echo Show, send an email to video-skills-on-echo-show-waitlist@amazon.com.