API Reference Overview (VSK FTV)
Alexa converts users' utterances (for example, a request to search for a TV show or to play a movie) into directives. A directive is a set of data and instructions, expressed in JSON, that Alexa sends to your Lambda. Video skills for Fire TV apps can support a variety of directives, such as SearchAndDisplayResults and others. Different interfaces (APIs) in the Video Skill API send different directives to your Lambda.
Your Lambda must interpret and handle the directive to fulfill the user's request. Your Lambda both sends a response back to Alexa and takes the appropriate action to fulfill the request.
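As a rough illustration of the interpret-and-respond flow described above, the following Python sketch routes an incoming directive to a handler by name and returns a response. It is a minimal sketch under stated assumptions, not a definitive implementation: the event layout (directive, then header, then name), the response envelope, and the error payload are assumed here and should be checked against the reference page for each directive. The handler function and the HANDLERS table are hypothetical names introduced only for this example.

```python
import uuid


def handle_search_and_display_results(directive):
    # Hypothetical handler: a real skill would search its catalog here and
    # take whatever action is needed to show the results in the app.
    return {"handled": directive["header"]["name"]}


# Hypothetical routing table from directive name to handler function.
HANDLERS = {
    "SearchAndDisplayResults": handle_search_and_display_results,
}


def build_response(name="Response", payload=None):
    # Assumed response envelope; verify the exact fields against the
    # reference documentation for the directive being handled.
    return {
        "event": {
            "header": {
                "namespace": "Alexa",
                "name": name,
                "messageId": str(uuid.uuid4()),
                "payloadVersion": "3",
            },
            "payload": payload or {},
        }
    }


def lambda_handler(event, context):
    # Assumed event layout: the directive name sits under directive.header.name.
    directive = event["directive"]
    handler = HANDLERS.get(directive["header"]["name"])
    if handler is None:
        # Assumed error shape for a directive this skill does not support.
        return build_response(
            name="ErrorResponse",
            payload={"type": "INVALID_DIRECTIVE", "message": "Unsupported directive"},
        )
    handler(directive)  # take the action that fulfills the request
    return build_response()  # then send a response back to Alexa
```

Keeping the routing in a dictionary means that supporting another directive only requires adding a handler and one table entry.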
- Available Directives
- Targeting Your Video Skill
- Comparison with Multimodal Directives
- Terminology – Requests versus Directives
Available Directives
Alexa sends the following directives to video skills for Fire TV apps.
||Sent when users ask Alexa to play specific video content.|
||Sent when users ask Alexa to search for video content.|
||Sent when users request to play, stop, and navigate playback for video content.|
||Sent when users request to fast-forward (or skip) or rewind to a specific duration.|
||Sent when users request to change the channel.|
||Sent when users request to start or stop recordings.|
||Sent when users request to search, cancel, or delete recordings.|
||Sent when users request to scroll right or left, page up or down, or select the item in focus.|
The details for each of these directives, as well as the utterances that trigger them, are described in each directive's reference documentation.
Targeting Your Video Skill
To target your video skill with the utterance, do the following:
- Say the utterance with your app open.
- Make your video skill's name explicit in the request, such as "Play [X] Show on XYZ" rather than just "Play [X] Show." (This is called an explicit utterance.)
Comparison with Multimodal Directives
Implementing Video Skills Kit for Multimodal Devices also involves interpreting and responding to directives from Alexa, as described in Directives Reference Overview. The directives aren't the same as those used for Fire TV apps, but they are similar:
- SearchAndPlay (FTV) is similar to GetPlayableItems (MM). These directives support play utterances.
- SearchAndDisplayResults (FTV) is similar to GetDisplayableItems (MM). These directives support search utterances.
However, multimodal devices have two directives for each of the above (such as GetDisplayableItemsMetadata), because the fundamental interaction model is different. With multimodal devices, your Lambda feeds the information back to Alexa in the response. With Fire TV apps, your Lambda pushes the needed information directly to your app through Amazon Device Messaging.
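The difference between the two interaction models can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the simplified payload shapes, and the search_fn and push_to_app callables are all hypothetical. In particular, push_to_app is a stand-in for whatever code sends a message through Amazon Device Messaging; it does not represent the actual ADM API.

```python
def handle_multimodal(search_fn, query):
    """Multimodal pattern: results travel back to Alexa inside the response."""
    results = search_fn(query)
    # Simplified, hypothetical payload shape.
    return {"payload": {"results": results}}


def handle_fire_tv(search_fn, push_to_app, query):
    """Fire TV pattern: results go to the app; the response only acknowledges."""
    results = search_fn(query)
    # push_to_app is a hypothetical stand-in for an Amazon Device Messaging send.
    push_to_app({"results": results})
    return {"payload": {}}


# Usage with in-memory stand-ins for the catalog search and the push channel.
def fake_catalog_search(query):
    return [query + " (season 1)"]


pushed_messages = []
multimodal_response = handle_multimodal(fake_catalog_search, "X Show")
fire_tv_response = handle_fire_tv(fake_catalog_search, pushed_messages.append, "X Show")
```

In the multimodal case the search results appear in the returned payload, while in the Fire TV case the returned payload stays empty and the results end up in the pushed message.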
Terminology – Requests versus Directives
The terms "request" and "directive" are mostly synonymous in the video skills documentation. "Request" is the more general term for any message Alexa sends to your Lambda. With video skills, the messages are labeled as a directive in the code, so we refer to the requests as "directives." This aligns with terminology used in other Alexa Skills Kit documentation.
Additionally, the term "directive" provides some differentiation between the user's utterance (e.g., a request to play a movie) and the information that Alexa sends to your Lambda.