Step 3: Understand the Alexa Directives and Lambda Responses (VSK Echo Show)

Integrate the VSK into a Multimodal Device

In the previous section, Step 2: Enable your Video Skill on a Multimodal Device and Test, you initiated a request from the Alexa cloud to your Lambda and observed the requests and responses logged in CloudWatch. Now that you've seen the directives Alexa sends and the responses your Lambda returns, let's unpack the code and explain what's going on in greater detail.

Basic Workflow
An Analogy of the Interaction
Stepping Section by Section through the Sample Lambda
Next Steps

Basic Workflow

The workflow (described in Architecture Overview) begins when a user says a phrase, such as "Play [media title]," to the multimodal device. Alexa's natural language processing intelligence does the work of parsing the user's utterances and figuring out the user's intent. Alexa then packages up the user's intent into a directive. The following directives are used with multimodal device interactions:

GetPlayableItems
GetPlayableItemsMetadata
GetBrowseNodeItems
GetDisplayableItems
GetDisplayableItemsMetadata
GetNextPage

Different scenarios (search, play, channel change, etc.) prompt different directives to be sent. These scenarios and the sent directives are defined in Directives Alexa Sends and in the reference documentation.

When users say, "Play the movie Manchester by the Sea", Alexa identifies all catalogs that contain the media, and then sends a GetPlayableItems directive to the appropriate Lambda functions for those catalogs. The GetPlayableItems directive looks like this:

Alexa Request: GetPlayableItems

{
    "directive": {
        "header": {
            "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
            "messageId": "9f4803ec-4c94-4fdf-89c2-d502d5e52bb4",
            "name": "GetPlayableItems",
            "namespace": "Alexa.VideoContentProvider",
            "payloadVersion": "3"
        },
        "endpoint": {
            "scope": {
                "type": "BearerToken",
                "token": "access-token-from-skill"
            },
            "endpointId": "videoDevice-001",
            "cookie": {

            }
        },
        "payload": {
            "entities": [
                {
                    "type": "Video",
                    "value": "Manchester by the Sea",
                    "externalIds": {
                        "imdb": "tt4574334"
                    }
                }
            ],
            "contentType": "RECORDING",
            "locale": "en-US",
            "minResultLimit": 8,
            "maxResultLimit": 25,
            "timeWindow": {
                "end": "2016-09-07T23:59:00+00:00",
                "start": "2016-09-01T00:00:00+00:00"
            }
        }
    }
}

The directive name appears in the header block. You can see that this is a GetPlayableItems directive.

Your Lambda receives this directive as an event. Your Lambda code needs to perform whatever lookups are necessary to identify what media titles match that request (based on the payload in the directive). As part of the lookup, your Lambda might identify additional media titles relevant to the user's request.
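
As a rough illustration of that first step, the sketch below pulls the searchable entities out of a GetPlayableItems event. The helper name extractSearchTerms is hypothetical, and the event is a trimmed copy of the directive shown above. How you feed these values into your own lookups depends on your backend.

```javascript
// Hypothetical helper: collect the entities from a GetPlayableItems directive
// so they can be handed to whatever lookup your backend provides.
function extractSearchTerms(event) {
    const payload = (event.directive && event.directive.payload) || {};
    const entities = Array.isArray(payload.entities) ? payload.entities : [];

    return entities.map((entity) => ({
        type: entity.type,                     // e.g. "Video"
        value: entity.value,                   // e.g. "Manchester by the Sea"
        externalIds: entity.externalIds || {}  // e.g. { imdb: "tt4574334" }
    }));
}

// Trimmed-down version of the directive shown above.
const sampleEvent = {
    directive: {
        header: { name: "GetPlayableItems" },
        payload: {
            entities: [{
                type: "Video",
                value: "Manchester by the Sea",
                externalIds: { imdb: "tt4574334" }
            }],
            locale: "en-US"
        }
    }
};

const terms = extractSearchTerms(sampleEvent);
console.log(terms[0].value); // Manchester by the Sea
```

Returning an empty array when the payload has no entities keeps the caller from having to check for missing fields before searching.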

Then your Lambda returns a response to Alexa that conforms with the requirements for responses for that directive type. For GetPlayableItems directives, the GetPlayableItemsResponse looks like this:

Lambda Response: GetPlayableItemsResponse

{
    "event": {
        "header": {
            "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
            "messageId": "5f0a0546-caad-416f-a617-80cf083a05cd",
            "name": "GetPlayableItemsResponse",
            "namespace": "Alexa.VideoContentProvider",
            "payloadVersion": "3"
        },
        "payload": {
            "nextToken": "fvkjbr20dvjbkwOpqStr",
            "mediaItems": [
                {
                    "mediaIdentifier": {
                        "id": "recordingId://provider1.dvr.rp.1234-2345-63434-asdf"
                    }
                },
                {
                    "mediaIdentifier": {
                        "id": "recordingId://provider1.dvr.rp.1234-2345-63434-asdf"
                    }
                }
            ]
        }
    }
}

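In this response, the mediaItems array contains two mediaIdentifier entries matching the user's request. (The sample reuses one placeholder ID for both entries; in practice, each entry would carry a distinct ID.) The values for these mediaIdentifier properties correspond with the content IDs in your catalog.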
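
A minimal sketch of assembling such a response from a list of catalog content IDs might look like the following. The function name, the placeholder IDs, and the choice to treat nextToken as optional are assumptions made for illustration. Echoing the directive's correlationToken mirrors the samples on this page rather than a stated requirement.

```javascript
const crypto = require('crypto');

// Hypothetical helper: wrap catalog content IDs in a GetPlayableItemsResponse.
// Assumption: the correlationToken is copied from the incoming directive,
// as the request/response samples on this page suggest.
function buildGetPlayableItemsResponse(event, contentIds, nextToken) {
    const payload = {
        mediaItems: contentIds.map((id) => ({ mediaIdentifier: { id } }))
    };
    if (nextToken) {
        // Assumption: nextToken is treated as optional in this sketch.
        payload.nextToken = nextToken;
    }
    return {
        event: {
            header: {
                correlationToken: event.directive.header.correlationToken,
                messageId: crypto.randomUUID(), // requires Node.js 14.17 or later
                name: "GetPlayableItemsResponse",
                namespace: "Alexa.VideoContentProvider",
                payloadVersion: "3"
            },
            payload
        }
    };
}

// Placeholder IDs; real values would come from your catalog.
const response = buildGetPlayableItemsResponse(
    { directive: { header: { correlationToken: "example-token" } } },
    ["content-id-1", "content-id-2"]
);
console.log(response.event.payload.mediaItems.length); // 2
```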

After Alexa receives the GetPlayableItemsResponse response, Alexa might ask the user to clarify which media title the user wants to play, or which video provider the user wants to play the media from (in cases where multiple providers have the same media).

After resolving the media the user wants to play, Alexa will then send another directive to your Lambda called GetPlayableItemsMetadata. This directive asks your Lambda for more details about the chosen media title. Alexa needs this in order to show information about this media title (the value for the mediaIdentifier — for example, recordingId://provider1.dvr.rp.1234-2345-63434-asdf — means nothing to Alexa). You need to supply information related to this mediaIdentifier that indicates what to show on the user's screen, such as the title, thumbnail, duration, rating, etc.

The GetPlayableItemsMetadata that Alexa sends might look like this:

Alexa Request: GetPlayableItemsMetadata

{

    "directive": {
        "header": {
            "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
            "messageId": "0f918d6e-ebae-48f1-a237-13c6f5b9f5da",
            "name": "GetPlayableItemsMetadata",
            "namespace": "Alexa.VideoContentProvider",
            "payloadVersion": "3"
        },
        "endpoint": {
            "scope": {
                "type": "BearerToken",
                "token": "access-token-from-skill"
            },
            "endpointId": "videoDevice-001",
            "cookie": {
            }
        },
        "payload": {
            "locale": "en-US",
            "mediaIdentifier": {
                    "id": "recordingId://provider1.dvr.rp.1234-2345-63434-asdf"
                }
        }
    }
}

You can see here that the payload identifies a specific mediaIdentifier that the user wants to play.

Your Lambda then retrieves the needed information about this mediaIdentifier and returns a GetPlayableItemsMetadataResponse response with more information about it. The response might look as follows:

Lambda Response: GetPlayableItemsMetadataResponse

{
    "event": {
        "header": {
            "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
            "messageId": "38ce5b22-eeff-40b8-a84f-979446f9b27e",
            "name": "GetPlayableItemsMetadataResponse",
            "namespace": "Alexa.VideoContentProvider",
            "payloadVersion": "3"
        },
        "payload": {
            "searchResults": [
                {
                    "name": "Interstellar",
                    "contentType": "ON_DEMAND",
                    "series": {
                        "seasonNumber": "1",
                        "episodeNumber": "1",
                        "seriesName": "The Big Bang Theory",
                        "episodeName": "Pilot"
                    },
                    "playbackContextToken": "{\"streamUrl\": \"http:\/\/samplemediasite.com\/sample\/video.mp4\", \"title\": \"Some Video Title\"}",
                    "parentalControl": {
                        "pinControl": "REQUIRED"
                    },
                    "absoluteViewingPositionMilliseconds": 1232340
                }
            ]
        }
    }
}

Alexa then passes the playbackContextToken to your web player, which then converts this identifier into a media playback URL and loads the media.
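
To make that conversion concrete, here is a sketch of how a web player could recover a playback URL from the token. It assumes the token is a JSON string with a streamUrl field, as in the sample response above. The function name resolvePlaybackUrl is hypothetical, and your own token format may differ.

```javascript
// Hypothetical web-player helper: recover the stream URL from a playbackContextToken.
// Assumption: the token is a JSON string with a "streamUrl" field, as in the sample.
function resolvePlaybackUrl(playbackContextToken) {
    let parsed;
    try {
        parsed = JSON.parse(playbackContextToken);
    } catch (err) {
        return null; // the token was not valid JSON
    }
    return parsed && typeof parsed.streamUrl === "string" ? parsed.streamUrl : null;
}

const token = '{"streamUrl": "http://samplemediasite.com/sample/video.mp4", "title": "Some Video Title"}';
console.log(resolvePlaybackUrl(token)); // http://samplemediasite.com/sample/video.mp4
```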

An Analogy of the Interaction

To put this interaction more concretely, consider this analogy. A customer walks into a video store and tells the clerk, "I want to watch To Kill a Mockingbird." The clerk passes the request to a backroom worker, who looks through the media library and locates the Mockingbird section. The worker finds multiple matches with different editions and variations: some are the film with Gregory Peck, and others are loosely related titles such as Mockingjay from The Hunger Games.

The backroom worker relays the info back to the clerk, who then asks the customer, "Which of these media titles do you actually want?" The customer says, "I want the first one (with Gregory Peck)." The clerk then turns to the backroom worker and says the customer wants the Gregory Peck media title. The backroom worker retrieves all the details about the Gregory Peck title and returns this info to the clerk. The clerk loads the media into a player and plays it for the customer.

In short, the user's request gets converted to a GetPlayableItems directive sent to your Lambda. Your Lambda responds with a GetPlayableItemsResponse listing the matching titles. Alexa replies with a GetPlayableItemsMetadata directive for the title the user selects, and your Lambda replies with a GetPlayableItemsMetadataResponse containing all the details for playback.

Exactly how you code your Lambda and call the backend services that retrieve the right data is up to you. The documentation here does not provide tutorials on interacting with your backend services to gather and generate the needed responses, because code and backend services differ considerably from partner to partner.

Stepping Section by Section through the Sample Lambda

The following sections will explain the logic in the sample Lambda function. The sample Lambda function, provided in Step 1: Create Your Video Skill and Lambda Function, specifically the section Step 1.3: Create the Lambda Function for Your Video Skill, tries to demonstrate the required responses for several directive types. Let's unpack the Lambda function section by section.

Note: Keep in mind that this Lambda doesn't represent what your Lambda will look like. The hardCodedResponse function here bypasses the lookups that your backend service will need to perform. To handle the incoming events arriving at your Lambda, your code will need to interface with some backend service to perform lookups, searches, etc., and return the needed information. To avoid getting lost in detailed code that might include a lot of logic foreign to your own implementation, we've simply hard-coded the responses here to show you the requirements of the response.

Also, note that you can write your Lambda code in a variety of languages. See the AWS Lambda documentation for instructions on working with other languages.

This tutorial uses Node.js. Here's the full sample Lambda. After the code, we'll step through it section by section.

Note: For convenience in breaking up the code, various "section" labels have been inserted. These section labels correspond with the section subheadings that follow.

// section 1 begin
var AWS = require('aws-sdk');

exports.handler = (event, context, callback) => {
    console.log("Interaction starts");
    hardCodedResponse(event, context);
};
// section 1 end

// section 2 begin
function hardCodedResponse(event, context) {
    var name = event.directive.header.name;
    console.log("Alexa Request: ", name, JSON.stringify(event));
// section 2 end

// section 3 begin
var DiscoverResultResponse = {
    "event": {
        "header": {
            "namespace": "Alexa.Discovery",
            "name": "Discover.Response",
            "payloadVersion": "3",
            "messageId": "ff746d98-ab02-4c9e-9d0d-b44711658414"
        },
        "payload": {
            "endpoints": [{
                "endpointId": "ALEXA_VOICE_SERVICE_EXTERNAL_MEDIA_PLAYER_VIDEO_PROVIDER",
                "endpointTypeId": "TEST_VSK_MM",
                "manufacturerName": "TEST_VSK_MM",
                "friendlyName": "TEST_VSK_MM",
                "description": "TEST_VSK_MM",
                "displayCategories": ["APPLICATION"],
                "cookie": {},
                "capabilities": [{
                    "type": "AlexaInterface",
                    "interface": "Alexa.RemoteVideoPlayer",
                    "version": "1.0"
                }, {
                    "type": "AlexaInterface",
                    "interface": "Alexa.PlaybackController",
                    "version": "1.0"
                }, {
                    "type": "AlexaInterface",
                    "interface": "Alexa.SeekController",
                    "version": "1.0"
                }, {
                    "type": "AlexaInterface",
                    "interface": "Alexa.ChannelController",
                    "version": "1.0"
                }, {
                    "type": "AlexaInterface",
                    "interface": "Alexa.MultiModalLandingPage",
                    "version": "1.0"
                }]
            }]
        }
    }
};

var GetPlayableItemsResponse = {
    "event": {
        "header": {
            "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
            "messageId": "5f0a0546-caad-416f-a617-80cf083a05cd",
            "name": "GetPlayableItemsResponse",
            "namespace": "Alexa.VideoContentProvider",
            "payloadVersion": "3"
        },
        "payload": {
            "nextToken": "fvkjbr20dvjbkwOpqStr",
            "mediaItems": [{
                "mediaIdentifier": {
                    "id": "tt1254207"
                }
            }]
        }
    }
};

var GetPlayableItemsMetadataResponse = {
    "event": {
        "header": {
            "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
            "messageId": "38ce5b22-eeff-40b8-a84f-979446f9b27e",
            "name": "GetPlayableItemsMetadataResponse",
            "namespace": "Alexa.VideoContentProvider",
            "payloadVersion": "3"
        },
        "payload": {
            "searchResults": [{
                "name": "Big Buck Bunny",
                "contentType": "ON_DEMAND",
                "series": {
                    "seasonNumber": "1",
                    "episodeNumber": "1",
                    "seriesName": "Blender Foundation Videos",
                    "episodeName": "Pilot"
                },
                "playbackContextToken": "{\"streamUrl\": \"http:\/\/commondatastorage.googleapis.com\/gtv-videos-bucket\/sample\/BigBuckBunny.mp4\", \"title\": \"Big Buck Bunny\"}",
                "parentalControl": {
                    "pinControl": "REQUIRED"
                },
                "absoluteViewingPositionMilliseconds": 1232340
            }]
        }
    }
};

var GetDisplayableItemsResponse = {
    "event": {
        "header": {
            "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
            "messageId": "5f0a0546-caad-416f-a617-80cf083a05cd",
            "name": "GetDisplayableItemsResponse",
            "namespace": "Alexa.VideoContentProvider",
            "payloadVersion": "3"
        },
        "payload": {
            "nextToken": "fvkjbr20dvjbkwOpqStr",
            "mediaItems": [{
                "mediaIdentifier": {
                    "id": "tt1254207"
                }
            }, {
                "mediaIdentifier": {
                    "id": "tt0807840"
                }
            }]
        }
    }
};

var GetNextPageResponse = {
    "event": {
        "header": {
            "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
            "messageId": "9f4803ec-4c94-4fdf-89c2-d502d5e52bb4",
            "name": "GetNextPageResponse",
            "namespace": "Alexa.VideoContentProvider",
            "payloadVersion": "3"
        },
        "endpoint": {
            "scope": {
                "type": "BearerToken",
                "token": "Alexa-access-token"
            },
            "endpointId": "appliance-001"
        },
        "payload": {
            "nextToken": "qefjrfiugef74",
            "mediaItems": [{
                "mediaIdentifier": {
                    "id": "tt0807840"
                }
            }, {
                "mediaIdentifier": {
                    "id": "tt1254207"
                }
            }, {
                "mediaIdentifier": {
                    "id": "tt7993892"
                }
            }, {
                "mediaIdentifier": {
                    "id": "tt2285752"
                }
            }, {
                "mediaIdentifier": {
                    "id": "tt4957236"
                }
            }]
        }
    }
};

var GetDisplayableItemsMetadataResponse = {
    "event": {
        "header": {
            "correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
            "messageId": "38ce5b22-eeff-40b8-a84f-979446f9b27e",
            "name": "GetDisplayableItemsMetadataResponse",
            "namespace": "Alexa.VideoContentProvider",
            "payloadVersion": "3"
        },
        "payload": {
            "resultsTitle": "SearchResults",
            "searchResults": [{
                "name": "Big Buck Bunny",
                "contentType": "ON_DEMAND",
                "itemType": "VIDEO",
                "releaseYear": "2014",
                "selectionAction": "PLAY",
                "thumbnailImage": {
                    "contentDescription": "Big Buck Bunny image",
                    "sources": [{
                        "url": "https:\/\/devportal-reference-docs.s3-us-west-1.amazonaws.com\/video-skills-kit\/bigbuckbunnythumb.png",
                        "size": "X_LARGE",
                        "widthPixels": 1920,
                        "heightPixels": 1280
                    }]
                },
                "runtime": {
                    "runTimeInMilliseconds": 5400000,
                    "displayString": "9m"
                },
                "closedCaption": {
                    "status": "AVAILABLE",
                    "displayString": "CC"
                },
                "absoluteViewingPositionMilliseconds": 0,
                "parentalControl": {
                    "pinControl": "REQUIRED"
                },
                "viewingDisplayString": "PurchaseOptions",
                "reviews": [{
                    "totalReviewCount": 1897,
                    "type": "FIVE_STAR",
                    "ratingDisplayString": "4.06"
                }],
                "rating": {
                    "category": "G"
                },
                "mediaIdentifier": {
                    "id": "tt1254207"
                }
            }]
        }
    }
};
// section 3 end

// section 4 begin
if (name === 'Discover') {
    console.log("Lambda Response: DiscoverResultResponse", JSON.stringify(DiscoverResultResponse));
    context.succeed(DiscoverResultResponse);
} else if (name === 'GetPlayableItems') {
    console.log("Lambda Response: GetPlayableItemsResponse", JSON.stringify(GetPlayableItemsResponse));
    context.succeed(GetPlayableItemsResponse);
} else if (name === 'GetPlayableItemsMetadata') {
    console.log("Lambda Response: GetPlayableItemsMetadataResponse", JSON.stringify(GetPlayableItemsMetadataResponse));
    context.succeed(GetPlayableItemsMetadataResponse);
} else if (name === 'GetDisplayableItems') {
    console.log("Lambda Response: GetDisplayableItemsResponse", JSON.stringify(GetDisplayableItemsResponse));
    context.succeed(GetDisplayableItemsResponse);
} else if (name === 'GetDisplayableItemsMetadata') {
    console.log("Lambda Response: GetDisplayableItemsMetadataResponse", JSON.stringify(GetDisplayableItemsMetadataResponse));
    context.succeed(GetDisplayableItemsMetadataResponse);
} else if (name === 'GetNextPage') {
    console.log("Lambda Response: GetNextPageResponse", JSON.stringify(GetNextPageResponse));
    context.succeed(GetNextPageResponse);
}
};
// section 4 end

Section 1 Explanation

// section 1 begin
var AWS = require('aws-sdk');

exports.handler = (event, context, callback) => {
    console.log("Interaction starts");
    hardCodedResponse(event, context);
};
// section 1 end

First we declare a dependency on the AWS SDK for JavaScript in Node.js. This SDK allows your Node.js code to perform a number of functions inside of AWS, which you can read about in the AWS SDK documentation. The sample itself never references the AWS object, so the hard-coded responses do not depend on it.

The handler method is explained in AWS Lambda Function Handler in Node.js. When a Lambda function is invoked, AWS Lambda starts executing your code by calling the handler function. AWS Lambda passes any event data to this handler as the first parameter. The runtime passes three arguments to the handler method: event, context, and callback:

event: The first argument is the event object, which contains information from the invoker. In this case, the invoker is Alexa and the event is the directive. Alexa passes this directive as a JSON-formatted string when it calls Invoke.
context: The second argument is the context object, which contains information about the invocation, function, and execution environment.
callback: The third argument, callback, is a function that you can call in non-async functions to send a response. The callback function takes two arguments: an Error and a response. The response object must be compatible with JSON.stringify.

Your handler should process the incoming event data and may invoke any other functions/methods in your code.
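
For comparison with the sample, which responds through context.succeed, the sketch below shows the callback form described in the list above. The response object is a placeholder chosen for illustration, not a valid Alexa response.

```javascript
// Sketch of a non-async handler that responds through the callback argument.
// The returned object is a placeholder, not a real Alexa response.
function handler(event, context, callback) {
    try {
        const name = event.directive.header.name;
        console.log("Alexa Request:", name);
        callback(null, { receivedDirective: name }); // first argument: no error
    } catch (err) {
        callback(err); // report the failure to the Lambda runtime
    }
}

exports.handler = handler;
```

Passing an Error as the first argument marks the invocation as failed. Passing null with a second argument returns that object as the response.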

To see the event and context, you can log these to the console:

console.log("Alexa Request: " + JSON.stringify(event, null, 2));
console.log("Context: " + JSON.stringify(context, null, 2));

The event is logged as part of the hardCodedResponse function; the context isn't logged at all in the sample Lambda.

The event and context are JavaScript objects. JSON.stringify serializes an object to a JSON string. The method takes up to three parameters: the object, a replacer (null here, meaning nothing is filtered out), and a spacing value (2 here, for two-space indentation).
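
The effect of each JSON.stringify parameter is easy to see on a small object. The header object below is a made-up fragment used only for illustration.

```javascript
// Illustrative fragment; not a complete directive header.
const header = { name: "GetPlayableItems", payloadVersion: "3" };

const compact = JSON.stringify(header);           // no spacing: a single line
const pretty = JSON.stringify(header, null, 2);   // two-space indentation
const onlyName = JSON.stringify(header, ["name"]); // replacer array keeps listed keys

console.log(compact);  // {"name":"GetPlayableItems","payloadVersion":"3"}
console.log(pretty);
console.log(onlyName); // {"name":"GetPlayableItems"}
```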

The context isn't necessarily important here, but it shows the name of the Lambda function invoked, the log stream, the memory limit, and other details. In CloudWatch, the context for an event in our workflow looks as follows:

Context:
{
    "callbackWaitsForEmptyEventLoop": true,
    "functionVersion": "$LATEST",
    "functionName": "hawaii_echo_lambda",
    "memoryLimitInMB": "128",
    "logGroupName": "/aws/lambda/hawaii_echo_lambda",
    "logStreamName": "2019/06/21/[$LATEST]2eaa24e01fff497187f6d0fcc2230e8d",
    "invokedFunctionArn": "arn:aws:lambda:us-east-1:458179560631:function:hawaii_echo_lambda",
    "awsRequestId": "1b0c6361-bbe5-440b-95cb-3024f3abfa53"
}

You can see that hawaii_echo_lambda is the Lambda invoked, and logs in CloudWatch are grouped in /aws/lambda/hawaii_echo_lambda.

The handler then calls hardCodedResponse(event, context), passing in the event and context. This function is explained in the next section.

Section 2 Explanation

// section 2 begin
function hardCodedResponse(event, context) {
    var name = event.directive.header.name;
    console.log("Alexa Request: ", name, JSON.stringify(event));
// section 2 end
...}

The sample Lambda code contains a function called hardCodedResponse that takes the event and context as parameters. This shows how you can get information from the incoming event (the Alexa request) and set variables for the information contained in that event.

For example, the sample Lambda code sets a variable called name to store the value for event.directive.header.name.

As needed, you can set any properties in the directive to some variable that you use in your lookups. The information you need (and how you manipulate it) depends on the task you're trying to perform. For example, you might want to perform a lookup based on a movie title. As such, you might need certain details to feed into your lookup functions with your backend services.
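
One way to gather several directive properties at once is with destructuring, sketched below. The helper name readDirective and the set of fields it returns are illustrative choices. The field paths follow the directive samples earlier on this page.

```javascript
// Hypothetical helper: read commonly needed fields from a directive,
// tolerating sections that are missing from the event.
function readDirective(event) {
    const { header = {}, endpoint = {}, payload = {} } = event.directive || {};
    const scope = endpoint.scope || {};
    return {
        name: header.name,
        correlationToken: header.correlationToken,
        accessToken: scope.token,
        locale: payload.locale
    };
}

const details = readDirective({
    directive: {
        header: { name: "GetPlayableItemsMetadata", correlationToken: "example-token" },
        endpoint: { scope: { type: "BearerToken", token: "access-token-from-skill" } },
        payload: { locale: "en-US" }
    }
});
console.log(details.name, details.locale); // GetPlayableItemsMetadata en-US
```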

Section 3 Explanation

At this point, the sample Lambda code simply defines a variable for each pre-defined response that it can return for a directive. For example:

//section 3 begin
var DiscoverResultResponse = {
  ...
};

var GetPlayableItemsResponse = {
...
};

var GetPlayableItemsMetadataResponse = {
    ...
};

var GetDisplayableItemsResponse = {
  ...
};

var GetNextPageResponse = {
  ...
};

var GetDisplayableItemsMetadataResponse = {
  ...
};
//section 3 end

As noted earlier, the sample Lambda hard-codes these responses. In a real implementation, you need to retrieve the needed information dynamically through your backend service. For example, based on the incoming payload, you would take the information and plug this into your own logic for resolving the user's request.
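
As a hedged sketch of what "dynamically" could mean, the example below swaps the hard-coded metadata response for a lookup. The in-memory Map stands in for a real backend service, the function name is hypothetical, and returning an empty searchResults list for an unknown ID is a placeholder choice, since the right behavior for that case is not covered here.

```javascript
const crypto = require('crypto');

// Stand-in for a backend service; the entry reuses sample values from this page.
const catalog = new Map([
    ["tt1254207", {
        name: "Big Buck Bunny",
        contentType: "ON_DEMAND",
        streamUrl: "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4"
    }]
]);

// Hypothetical helper: build a GetPlayableItemsMetadataResponse from a lookup.
function buildPlayableItemsMetadataResponse(event) {
    const { header, payload } = event.directive;
    const item = catalog.get(payload.mediaIdentifier.id);
    // Placeholder choice: an unknown ID yields an empty result list.
    const searchResults = item ? [{
        name: item.name,
        contentType: item.contentType,
        playbackContextToken: JSON.stringify({ streamUrl: item.streamUrl, title: item.name })
    }] : [];

    return {
        event: {
            header: {
                correlationToken: header.correlationToken,
                messageId: crypto.randomUUID(), // requires Node.js 14.17 or later
                name: "GetPlayableItemsMetadataResponse",
                namespace: "Alexa.VideoContentProvider",
                payloadVersion: "3"
            },
            payload: { searchResults }
        }
    };
}

const metadataResponse = buildPlayableItemsMetadataResponse({
    directive: {
        header: { correlationToken: "example-token" },
        payload: { locale: "en-US", mediaIdentifier: { id: "tt1254207" } }
    }
});
console.log(metadataResponse.event.payload.searchResults[0].name); // Big Buck Bunny
```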

The sample Lambda here doesn't have a backend service with media information and such, so the responses are simply pre-defined. As a result, if you're using this sample Lambda in a test to explore how video skills work on multimodal devices, you're limited to queries for the content defined here.

In the future, a more dynamic sample Lambda might be made available in this documentation. However, since the lookup process will vary drastically from partner to partner based on their differing backend services and programming languages, this extra code to query a backend service might not be that instructive.

Section 4 Explanation

// section 4 begin
if (name === 'Discover') {
    console.log("Lambda Response: DiscoverResultResponse", JSON.stringify(DiscoverResultResponse));
    context.succeed(DiscoverResultResponse);
} else if (name === 'GetPlayableItems') {
    console.log("Lambda Response: GetPlayableItemsResponse", JSON.stringify(GetPlayableItemsResponse));
    context.succeed(GetPlayableItemsResponse);
} else if (name === 'GetPlayableItemsMetadata') {
    console.log("Lambda Response: GetPlayableItemsMetadataResponse", JSON.stringify(GetPlayableItemsMetadataResponse));
    context.succeed(GetPlayableItemsMetadataResponse);
} else if (name === 'GetDisplayableItems') {
    console.log("Lambda Response: GetDisplayableItemsResponse", JSON.stringify(GetDisplayableItemsResponse));
    context.succeed(GetDisplayableItemsResponse);
} else if (name === 'GetDisplayableItemsMetadata') {
    console.log("Lambda Response: GetDisplayableItemsMetadataResponse", JSON.stringify(GetDisplayableItemsMetadataResponse));
    context.succeed(GetDisplayableItemsMetadataResponse);
} else if (name === 'GetNextPage') {
    console.log("Lambda Response: GetNextPageResponse", JSON.stringify(GetNextPageResponse));
    context.succeed(GetNextPageResponse);
}
};
// section 4 end

The final section of the function returns the appropriate response based on the directive name. If the directive was GetPlayableItems, then the GetPlayableItemsResponse is passed back to Alexa. The context.succeed method ends the invocation and hands the given object back as the function's result; the sample uses it rather than the callback argument.
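
The if/else chain in the sample has no final else branch, so an unrecognized directive name gets no explicit response. One possible alternative, sketched below, is a lookup table keyed by directive name with an explicit failure for anything unexpected. The function name selectResponse and the decision to throw are assumptions for illustration, and the response objects are abbreviated stubs.

```javascript
// Hypothetical dispatcher: pick a response by directive name,
// failing loudly when the name is not recognized.
function selectResponse(name, responsesByDirective) {
    if (Object.prototype.hasOwnProperty.call(responsesByDirective, name)) {
        return responsesByDirective[name];
    }
    throw new Error("Unsupported directive: " + name);
}

// Abbreviated stubs standing in for the full response objects.
const responsesByDirective = {
    GetPlayableItems: { event: { header: { name: "GetPlayableItemsResponse" } } },
    GetNextPage: { event: { header: { name: "GetNextPageResponse" } } }
};

console.log(selectResponse("GetNextPage", responsesByDirective).event.header.name);
// prints "GetNextPageResponse"
```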

In the CloudWatch logs, look for a line that begins with Lambda Response to see the response that your Lambda sends back to Alexa.

Next Steps

Go on to Step 4: Understand How Your Web Player Gets the Media Playback URL.

Last updated: Nov 02, 2020