Alexa.Media.Playback Interface

The Alexa.Media.Playback interface enables Alexa to start immediate playback of content on an Alexa device.

Understand ContentId

To build a high quality music or podcast skill, you must understand ContentId. ContentId identifies a listening experience that a skill can return and play on a device. A ContentId can reference a track, an editorial playlist of popular songs, a custom (artist- or genre-seeded) station, an album, or a season of a program series.

ContentId must be globally unique within your skill, be long-lived, and always represent the same experience for all skill users. For example, imagine that a user says, "Alexa, play the album Rainier Fog by Alice in Chains", and the skill returns a ContentId of "123" to represent the album. This same ContentId should represent this album for all users. When another user says, "Alexa, play the album Rainier Fog by Alice in Chains", your skill should send the same ContentId of "123" in response to the GetPlayableContent request, even if the request happens one year after the original user's request.

Here is another example: Imagine that your music service has a "Top Weekly Songs" playlist, where the list of songs in the playlist changes from week to week to reflect the most popular songs on the charts. The user says, "Alexa, play the top weekly songs playlist". Your skill responds with a ContentId of "321", which represents the "Top Weekly Songs" playlist. When Alexa sends this ContentId in an Initiate request, your skill returns the first track of the playlist, for example, Shallow by Lady Gaga. One month later, the user says, "Alexa, play the top weekly songs playlist". Your skill again responds with a ContentId of "321" because this ContentId always represents the "Top Weekly Songs" playlist. However, this time when Alexa sends this ContentId in an Initiate request, your skill returns, for example, the song Better Now by Post Malone, because the playlist contents change weekly.
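The two examples above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the skill API: the catalog dictionaries and the functions get_playable_content_id and first_track_for are hypothetical names, and the week-indexed playlist stands in for whatever your music service stores.

```python
from typing import Dict, List

# Hypothetical catalog: each ContentId is assigned once and never reused.
CONTENT_IDS: Dict[str, str] = {
    "album:rainier-fog": "123",
    "playlist:top-weekly-songs": "321",
}

# Hypothetical playlist contents keyed by week number. Only the contents
# change over time, never the ContentId that points at the playlist.
TOP_WEEKLY_SONGS: Dict[int, List[str]] = {
    1: ["Shallow", "Better Now"],
    5: ["Better Now", "Shallow"],
}


def get_playable_content_id(entity_key: str) -> str:
    """Return the same ContentId for every user and every request."""
    return CONTENT_IDS[entity_key]


def first_track_for(content_id: str, week: int) -> str:
    """Resolve a ContentId to its first track at Initiate time."""
    if content_id == "321":
        return TOP_WEEKLY_SONGS[week][0]
    raise KeyError(f"unknown ContentId: {content_id}")
```

The identifier is looked up from a fixed table, while the track list is resolved only when the content is initiated, so a saved ContentId stays valid even though the first track differs from week to week.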

When a user sets a podcast alarm (for example, "Alexa, wake me up to the John Smith podcast from <skill name> at 8 AM"), Alexa saves the ContentId returned in the GetPlayableContent response. Each time the alarm is triggered, which might be months later for a repeating alarm, Alexa sends an Initiate request to your skill with the saved ContentId. The resulting queue of programs might be different because the program series changes daily, but the user is still listening to the same program series, so the result is correct.

When a user sets a music alarm (for example, "Alexa, wake me up to Can't Stop The Feeling by Justin Timberlake from <skill name> at 8 AM"), Alexa saves the ContentId returned in the GetPlayableContent response. Each time the alarm is triggered, which might be months later for a repeating alarm, Alexa sends an Initiate request to your skill with the saved ContentId, and the response should reflect the content the user requested when setting the alarm.

Similarly, when a user browses their history of music requests and selects an item to replay, Alexa calls Initiate with the saved ContentId. In the preceding example for "Top Weekly Songs," if the user sees "Top Weekly Songs" in their history and clicks to play it again, Alexa sends an Initiate request with a ContentId of "321." The resulting queue of songs might be different because the playlist changes weekly, but the user is still listening to the "Top Weekly Songs" playlist, so the result is correct.

Utterances

When you use the Alexa.Media.Playback interface, the voice interaction model is already built for you. The following examples show customer utterances:

Alexa, play the song Jeremy by Pearl Jam.
Alexa, play the podcast <program series name>.
Alexa, resume the program <program series name>.

Configure your skill to receive requests

You must configure your music skill to support this API before Alexa will send requests to it. You can configure your skill in the following ways:

Supporting premium audio

When the provider supports premium audio, the Initiate request contains a list of Endpoint objects. Each Endpoint corresponds to a playback device and identifies the content types that the provider can supply and that the target device can play. Currently, the list contains a single Endpoint object, which is the target device. Each Endpoint contains a list of ContentFormat objects, which hold the content type identifiers that the provider chooses from when it selects a playback stream.
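The selection step described above might look like the following sketch. It is deliberately simplified and makes several assumptions: the Endpoint dataclass, its attribute names, and the content type strings used in the usage note are hypothetical placeholders, not the real field names or values. See the Endpoint object reference for the actual structure.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Endpoint:
    # Hypothetical simplification: an endpoint reduced to the content type
    # identifiers found in its ContentFormat objects.
    endpoint_id: str
    content_types: List[str]


def choose_content_type(
    endpoints: Sequence[Endpoint], preferred: Sequence[str]
) -> Optional[str]:
    """Pick the provider's most preferred type that the target device can play."""
    if not endpoints:
        return None  # No endpoints in the request: premium audio does not apply.
    target = endpoints[0]  # The list currently holds only the target device.
    for content_type in preferred:
        if content_type in target.content_types:
            return content_type
    return None
```

For example, with a hypothetical preference order of ["HIGH_RES", "STANDARD"] and a target device that lists only "STANDARD", the function returns "STANDARD".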

Directives

Initiate directive

When Alexa receives a content identifier from a skill's GetPlayableContent response and is ready to start immediate playback of the content on an Alexa device, Alexa sends an Initiate request. The request includes the content identifier, and the skill responds with the stream URI for immediate playback of the content. The following table shows which content types use this directive:

| Content type | Required? |
|---|---|
| Music | Required |
| Radio | Required |
| Podcast | Required |

There are three primary scenarios that cause Alexa to call this directive:

  1. The user requested music, radio, or a program series to play, so playback is initiated immediately.
  2. A previously set music, radio, or podcast alarm is triggered. For example, the user set an alarm to play a song at 7:00 AM, so at that time Alexa makes an Initiate call to the skill.
  3. The user selects content from a play history UI (for example, in the Alexa app or on an Alexa device with a screen) to hear the content again.

Podcast skills define two types of program series: serial and episodic (non-serial). In a serial program series, users expect your skill to play episodes in order, from oldest to latest. In an episodic program series, each episode is a stand-alone program, and users expect your skill to play episodes from latest to oldest by default. Therefore, in its response to the Initiate directive, your skill should return the oldest program as the first item for a serial program series, and the latest program as the first item for an episodic (non-serial) program series.
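The ordering rule above can be sketched as follows. The Program dataclass and the order_queue function are hypothetical names used only for illustration. The first element of the returned list is the program that the skill would return in its Initiate response.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Program:
    title: str
    released: date


def order_queue(programs: List[Program], serial: bool) -> List[Program]:
    """Serial series play oldest first; episodic series play latest first."""
    return sorted(programs, key=lambda p: p.released, reverse=not serial)
```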

Initiate directive payload details

| Field | Description | Type |
|---|---|---|
| requestContext | Context information about the request. | A RequestContext object. |
| filters | Filters to apply during content resolution. | A Filter object. |
| contentId | The content to use for creating a playback queue. | String |
| currentItemReference | The item that's currently playing (active) on the target endpoint, if any. This property is absent when nothing is playing. Your skill should use this property to enforce concurrency limits. Specifically, it should use this property to determine whether the playback session starts on an endpoint where no stream is playing, or whether it replaces an existing stream on an endpoint. | A MediaReference object. |
| playbackModes | The playback modes requested by the user. If the user doesn't mention anything about a looped or shuffled queue, this attribute defaults to false for all supported playback modes. | Object |
| playbackModes.shuffle | True to shuffle the queue, false to play the queue in order. Note: Ignored for podcast skills. | Boolean |
| playbackModes.loop | True to start playing the queue again after it finishes, false to end. Note: Ignored for podcast skills. | Boolean |
| playbackPosition | (Podcast only) The position where playback should begin, based on the Alexa user's requirements. The only supported value is RESUME. If the user doesn't mention anything about where to start playback, this attribute is absent. | String |
| endpoints | (Premium audio only) A list of Endpoint objects containing the content type identifiers that the music provider supports that are playable on the target device. See the Endpoint object for more information. This field is present only if the provider supports premium audio. | List of objects |
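As a rough sketch of how a skill might interpret these fields, the following function reads an Initiate payload that has already been parsed into a dictionary. The field names come from the table above. The function name, the is_podcast_skill parameter, and the keys of the returned summary are hypothetical.

```python
from typing import Any, Dict


def summarize_initiate_payload(
    payload: Dict[str, Any], is_podcast_skill: bool = False
) -> Dict[str, Any]:
    """Extract the decisions a skill needs to make from an Initiate payload."""
    modes = payload.get("playbackModes", {})
    # Shuffle and loop are ignored for podcast skills.
    honor_modes = not is_podcast_skill
    return {
        "content_id": payload["contentId"],
        # currentItemReference is absent when nothing is playing on the endpoint.
        "replaces_existing_stream": "currentItemReference" in payload,
        "shuffle": honor_modes and bool(modes.get("shuffle", False)),
        "loop": honor_modes and bool(modes.get("loop", False)),
        # playbackPosition is absent unless the user asked to resume.
        "resume": payload.get("playbackPosition") == "RESUME",
        # endpoints is present only when premium audio is supported.
        "premium_audio": "endpoints" in payload,
    }
```

The replaces_existing_stream value is the input a skill would use for its concurrency check: a new session on an idle endpoint versus a replacement of an active stream.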

Initiate directive examples (music)

In the following example, there is no current content.

In the following example, there is current content.

The following example demonstrates support for premium audio.

Initiate directive examples (podcast)

When a user says, "Alexa, play the podcast/program <program series name>", the skill should return a valid response (containing a content reference) to the GetPlayableContent directive. Alexa then plays the content from that response. To start playback, Alexa sends an Initiate request, similar to the following example, instructing the skill to create a queue from the content reference.

The following example demonstrates the request that occurs when a user asks Alexa to resume content, for example "Alexa, resume the podcast/program <program series name>."

Initiate response event

If you handle an Initiate directive successfully, respond with an Alexa.Response event.

In response to the first of the preceding examples, the skill creates a queue for the user based on the requested ContentId and returns the queue identifier and the first audio item to Alexa. The Initiate response should contain enough information for Alexa to know how to manage the queue, and the first track to play for the user. To get the second track to play for the user, Alexa calls GetNextItem after beginning to play the first track. Subsequent tracks are also retrieved with GetNextItem after each track begins playback.

The time it takes your skill to respond to an Initiate request directly impacts the Alexa user experience. Music skills should adhere to the following response latency limits.

| Percentage of calls | Latency limit (ms) |
|---|---|
| 50% | 100 |
| 90% | 250 |
| 99% | 400 |
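One way to monitor these limits is to compute percentiles over recorded response times. The sketch below assumes that each limit applies to the corresponding percentile of observed latencies (for example, the 90th percentile should not exceed 250 ms). It uses the nearest-rank percentile method, and the function names are hypothetical.

```python
from typing import Dict, List

# Limits from the table above: percentile -> maximum latency in milliseconds.
LATENCY_LIMITS_MS: Dict[int, float] = {50: 100, 90: 250, 99: 400}


def percentile(samples: List[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding issues.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def latency_violations(samples: List[float]) -> Dict[int, float]:
    """Return {percentile: observed_ms} for every limit that is exceeded."""
    observed = {pct: percentile(samples, pct) for pct in LATENCY_LIMITS_MS}
    return {pct: ms for pct, ms in observed.items() if ms > LATENCY_LIMITS_MS[pct]}
```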

Initiate response event payload details

| Field | Description | Type | Required |
|---|---|---|---|
| playbackMethod | Information about the playback method that Alexa should use to achieve playback for the user, and the first track. | A PlaybackMethod object. | Yes |

Initiate response event examples (music)

The following example shows a response to the Initiate directive.

In the following example, the skill returns information about the item that Alexa should play for the user.

The following Initiate response example demonstrates support for premium audio.

The following example demonstrates support for digital rights management (DRM).

The following example adds a background image for Alexa to display while playing music. For more information, see the background field of the BaseMetadata object.

Initiate response event examples (podcast)

In response to the preceding Initiate directive example, the skill creates a queue for the user based on the requested ContentId and returns the queue identifier and the first audio item to Alexa. The Initiate response should contain enough information for Alexa to know how to manage the queue, and the first program to play for the user. To get the second program to play for the user, Alexa makes an additional call after beginning to play the first program.

To respond to a resume request, your skill should use the offsetInMilliseconds field in the returned stream object to indicate where to start playback. The following example shows a response for a resume request.
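A minimal sketch of setting the offset follows. It builds only a fragment of the stream object: offsetInMilliseconds is the field named above, the "uri" key is an assumption, and the build_stream function and its saved_offset_ms parameter (the position your service stored for this user) are hypothetical. A real stream object contains additional fields.

```python
from typing import Any, Dict


def build_stream(uri: str, resume: bool, saved_offset_ms: int) -> Dict[str, Any]:
    """Start at the saved position for a resume request, otherwise at zero."""
    offset = max(0, saved_offset_ms) if resume else 0
    return {
        "uri": uri,  # Assumed key name for the stream URI.
        "offsetInMilliseconds": offset,
    }
```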

To disable fast-forward and rewind for a program series or program, respond without the SEEK_ADJUST control as shown in the following example.

The following sample response contains the latest program content. Set the isLatest flag to true when a program is the latest, or most recently released, program in a program series. When the isLatest flag is true, the customer receives a prompt indicating that what they're about to hear is the latest episode. When the flag is set to false, or not specified in the response, the customer might hear a hint suggesting that they can ask to play the latest episode. If your skill doesn't implement the isLatest flag, customers receive incorrect prompts and playback behavior.
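Deriving the flag value might look like the following sketch. The is_latest function and the release_dates mapping (program identifier to release date for one program series) are hypothetical. Where the isLatest flag belongs in the response is defined by the API reference and isn't shown here.

```python
from datetime import date
from typing import Dict


def is_latest(program_id: str, release_dates: Dict[str, date]) -> bool:
    """True when program_id is the most recently released program in its series."""
    return release_dates[program_id] == max(release_dates.values())
```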

Initiate directive error handling

If your skill can't handle an Initiate directive successfully, it should respond with an Alexa.Media.ErrorResponse event or an Alexa.ErrorResponse event. For more information, see Alexa Music, Radio, and Podcast Skill API Error Responses.