About the Alexa Voice Service (AVS) AudioPlayer
The Alexa Voice Service (AVS) includes an AudioPlayer Interface for managing, controlling, and reporting on streaming audio content. For example, Amazon Music, Flash Briefing, Audible, and TuneIn skills all rely on the AudioPlayer Interface for streaming audio functionalities.
AVS sends directives to your device instructing the device to perform an action, such as playing an audio stream. In response, AVS expects the device to return events in a specific order as the device performs the actions. This topic provides conceptual information, definitions, and sequence diagrams to help you implement the AudioPlayer Interface.
AudioPlayer best practices
Use the following guidelines to provide a familiar Alexa experience to your customers.
Recommended media support
Play directives provide audio in a variety of formats, containers, and bit rates. For more details about the codecs, containers, streaming formats, and playlists that your product should support, see Recommended Media Support.
Playback queue management
Creating and managing the device playback queue ensures that media services associated with the AudioPlayer Interface work as designed. The device playback queue must:
- Have the ability to handle multiple Play directives.
- Use the playBehavior in the payload of each Play directive to adjust or maintain the device playback queue.
- Match the token of the active stream with the expectedPreviousToken of the stream that you're adding to the queue. If the tokens don't match, ignore the stream. However, if Alexa doesn't return any expectedPreviousToken, add the stream to the queue.
- Clear the queue whenever the device receives a ClearQueue directive.
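The queue requirements above can be sketched in a few lines of Python. This is a minimal illustration, not code from the AVS Device SDK: the Stream and PlaybackQueue classes and their method names are hypothetical. It assumes that the token check applies only to streams added with ENQUEUE or REPLACE_ENQUEUED, that a mismatched stream is ignored without changing the queue, and that a ClearQueue directive empties only the enqueued streams.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class Stream:
    token: str
    url: str
    # None models the case where Alexa returns no expectedPreviousToken.
    expected_previous_token: Optional[str] = None


@dataclass
class PlaybackQueue:
    """Hypothetical device-side playback queue (illustration only)."""
    active: Optional[Stream] = None
    pending: Deque[Stream] = field(default_factory=deque)

    def handle_play(self, play_behavior: str, stream: Stream) -> bool:
        """Apply one Play directive. Returns False if the stream is ignored."""
        if play_behavior == "REPLACE_ALL":
            # Drop everything that was enqueued and start the new stream.
            self.pending.clear()
            self.active = stream
            return True
        if play_behavior not in ("ENQUEUE", "REPLACE_ENQUEUED"):
            raise ValueError(f"unknown playBehavior: {play_behavior}")

        # Assumption: compare against the active stream's token, as the
        # guideline above describes, before touching the queue.
        expected = stream.expected_previous_token
        active_token = self.active.token if self.active else None
        if expected is not None and expected != active_token:
            return False

        if play_behavior == "REPLACE_ENQUEUED":
            # Replace enqueued streams; the active stream keeps playing.
            self.pending.clear()
        if self.active is None:
            self.active = stream
        else:
            self.pending.append(stream)
        return True

    def handle_clear_queue(self) -> None:
        # Assumption: only the enqueued streams are cleared here.
        self.pending.clear()
```

A device would call handle_play once per Play directive and handle_clear_queue once per ClearQueue directive. Checking the token before mutating the queue means that an ignored stream never leaves the queue half updated.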
AudioPlayer interaction sequence
Imagine that you're in the kitchen cooking dinner. Rather than reach for your phone to play music, you say, "Alexa, play some music." The following process describes the sequence of interactions between your device and AVS, resulting in your device playing the music that you requested.
- Your device sends a Recognize event, including the captured speech as a binary audio attachment, to AVS. AVS processes and translates the captured audio into a series of directives and potentially corresponding audio attachments. AVS sends these directives and any applicable audio attachments to your device, instructing the device to perform one or more actions.
- The device handles the Speak directive, which instructs your device to play Alexa speech, and sends a SpeechStarted event when the device starts playback of Alexa speech, such as, "Shuffling your music."
- The device sends a SpeechFinished event when playback of Alexa speech finishes.
- The device handles the Play directive, which instructs your device to start playback of your music. For more details about the Play directive, see the section Play directive walkthrough.
When playback begins, your device sends a series of lifecycle events to AVS. These events notify Alexa that playback has started, request the next stream, and provide progress reporting information to AVS and music service providers.
- PlaybackStarted – The device sends a PlaybackStarted event to AVS when playback begins. The offsetInMilliseconds sent to AVS should match the offset provided in the Play directive.
- PlaybackNearlyFinished – Send the PlaybackNearlyFinished event when your device is ready to buffer/download the next stream in your playback queue. One option is to send this event following the PlaybackStarted event to start buffering and reduce lag between playback of streams.
- ProgressReportDelayElapsed – Send the ProgressReportDelayElapsed event to AVS if the Play directive includes a progressReportDelayInMilliseconds.
- ProgressReportIntervalElapsed – Send the ProgressReportIntervalElapsed event to AVS if the Play directive includes a progressReportIntervalInMilliseconds.
- PlaybackFinished – Send the PlaybackFinished event when your device finishes playing a stream.
- PlaybackStopped – Send the PlaybackStopped event if your device receives a Stop directive and stops playback.
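As a rough illustration of the lifecycle above, the following Python sketch builds the events a device might send for a stream that plays to completion without progress reporting. The helper names are hypothetical, and the envelope layout (an event object with a header and a payload carrying token and offsetInMilliseconds) is an assumption made for this example; confirm the exact fields against the AudioPlayer Interface reference.

```python
import uuid
from typing import Dict, List


def build_audio_player_event(name: str, token: str, offset_ms: int) -> Dict:
    """Build one AudioPlayer event message (assumed envelope layout)."""
    return {
        "event": {
            "header": {
                "namespace": "AudioPlayer",
                "name": name,
                "messageId": str(uuid.uuid4()),  # unique per message
            },
            "payload": {"token": token, "offsetInMilliseconds": offset_ms},
        }
    }


def events_for_uninterrupted_stream(
    token: str, start_offset_ms: int, end_offset_ms: int
) -> List[Dict]:
    """Return events, in send order, for a stream that plays to the end."""
    return [
        # Offset matches the offset provided in the Play directive.
        build_audio_player_event("PlaybackStarted", token, start_offset_ms),
        # One option: send right after PlaybackStarted to begin buffering.
        build_audio_player_event("PlaybackNearlyFinished", token, start_offset_ms),
        build_audio_player_event("PlaybackFinished", token, end_offset_ms),
    ]
```

If the Play directive also requested progress reports, the ProgressReportDelayElapsed and ProgressReportIntervalElapsed events would be interleaved between PlaybackStarted and PlaybackFinished at the appropriate times.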
Play directive walkthrough
Remember that after you ask Alexa to play music, AVS returns a Play directive to your device, which instructs the device to start playing an audio stream or binary audio attachment. The payload provides your device with all the information needed to handle an audio stream and add it to your local playback queue, such as the stream URL, when the stream URL expires, the expected playback behavior, and progress reporting requirements.
The following example shows a payload from a Play directive:
{
  "directive": {
    "header": {
      "namespace": "AudioPlayer",
      "name": "Play",
      "messageId": "42941f13-90ed-4d9e-8159-xxxxxxxx",
      "dialogRequestId": "req:a345fgh598383xxx"
    },
    "payload": {
      "playBehavior": "REPLACE_ALL",
      "audioItem": {
        "audioItemId": "test1.as-ct.v1.XYZ-ABCDE-FGHIJ#ACRI#url#ACRI#0f6bcd24-f621-555a-822c-1111111:1",
        "stream": {
          "url": "https://opml.radiotime.com/Tune.ashx?serial=SAMPLE&formats=aac,mp3&partnerId=SAMPLE",
          "streamFormat": "AUDIO_MPEG",
          "offsetInMilliseconds": 0,
          "expiryTime": "2016-09-13T18:22:49+0000",
          "progressReport": {
            "progressReportDelayInMilliseconds": 15000,
            "progressReportIntervalInMilliseconds": 900000
          },
          "token": "test1.as-ct.v1.XYZ-ABCDE-FGHIJ#ACRI#url#ACRI#0f6bcd24-f621-555a-822c-1111111:1"
        }
      }
    }
  }
}
The following steps walk through the Play directive payload, explaining the parameter values in detail:
- The first payload parameter is playBehavior, which provides information about how this particular Play directive impacts your local playback queue. AVS supports the following three play behaviors:
  - REPLACE_ALL – Instructs your device to begin playback of the stream included in the payload and replace any enqueued streams in your local playback queue. In this example, the playBehavior value is REPLACE_ALL, meaning that your device must clear its local playback queue and then start playback of the audio stream included in the payload.
  - ENQUEUE – Instructs your device to add the stream contained in the Play directive to the end of your current playback queue.
  - REPLACE_ENQUEUED – Instructs your device to replace all streams in your local playback queue. This doesn't impact the actively playing stream.
- The next item in the payload is the audioItem object, which includes audioItemId and stream:
  - audioItemId – Opaque token that identifies the audio stream.
  - stream – Object that provides specific information about the audio stream, including:
    - url – URL of the audio content. If the audio content is a binary audio attachment, the value is a unique identifier for the content formatted with the following prefix: cid:.
    - streamFormat – Format of the audio stream.
    - offsetInMilliseconds – Offset in milliseconds from which your device should start playback of the audio stream.
    - expiryTime – Timestamp, in ISO 8601 format, for when the stream becomes invalid.
    - progressReport – Object that contains information about the progress reports required by the content provider. progressReport supports progressReportIntervalInMilliseconds and progressReportDelayInMilliseconds.
      - progressReportDelayInMilliseconds – Offset for when to send the initial progress report. The device sends the ProgressReportDelayElapsed event at the exact delay specified in the Play directive.
      - progressReportIntervalInMilliseconds – Offset for when progress reports must be periodically sent, which is each time the offset elapses from the start of the track.
    - token – Opaque token that represents the current audio stream.
For a complete listing of directives/events and associated behaviors, see the AudioPlayer Interface.
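To make the walkthrough concrete, here is a minimal Python sketch that extracts the fields described above from a Play directive that has already been decoded into a dictionary. The ParsedPlay class and both function names are hypothetical. The sketch assumes that expiryTime and progressReport may be absent, and that expiryTime uses the timestamp layout shown in the example payload.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ParsedPlay:
    play_behavior: str
    token: str
    url: str
    is_attachment: bool          # True when url carries the cid: prefix
    offset_ms: int
    expiry: Optional[datetime]
    delay_ms: Optional[int]      # progressReportDelayInMilliseconds
    interval_ms: Optional[int]   # progressReportIntervalInMilliseconds


def parse_play_directive(message: dict) -> ParsedPlay:
    """Pull the stream details out of a decoded Play directive."""
    payload = message["directive"]["payload"]
    stream = payload["audioItem"]["stream"]
    report = stream.get("progressReport") or {}
    expiry_raw = stream.get("expiryTime")
    # Assumption: same layout as the example, e.g. 2016-09-13T18:22:49+0000.
    expiry = (
        datetime.strptime(expiry_raw, "%Y-%m-%dT%H:%M:%S%z")
        if expiry_raw
        else None
    )
    return ParsedPlay(
        play_behavior=payload["playBehavior"],
        token=stream["token"],
        url=stream["url"],
        is_attachment=stream["url"].startswith("cid:"),
        offset_ms=stream.get("offsetInMilliseconds", 0),
        expiry=expiry,
        delay_ms=report.get("progressReportDelayInMilliseconds"),
        interval_ms=report.get("progressReportIntervalInMilliseconds"),
    )


def is_expired(play: ParsedPlay, now: Optional[datetime] = None) -> bool:
    """Return True if the stream URL is past its expiry time."""
    now = now or datetime.now(timezone.utc)
    return play.expiry is not None and now >= play.expiry
```

Keeping the parsed values in one small record makes it easier to hand them to the playback queue and to the progress reporting logic without passing raw JSON around.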
Progress reporting
AVS uses the progressReport object of the Play directive to describe which progress reports a content provider requires for a given audio stream. If a Play directive payload contains progressReportDelayInMilliseconds, progressReportIntervalInMilliseconds, or both, the audio content provider requires progress reporting for this specific stream. When determining when to send a progress report, your device must measure the delay and interval from the start of the stream, not from the offset specified by the Play directive.
When these parameters are present, your device must send the following corresponding lifecycle events:
- progressReportDelayInMilliseconds – If the Play directive contains the progressReportDelayInMilliseconds parameter, the device must send the following event to AVS:
  - ProgressReportDelayElapsed – Send this event at the specified delay from the start of the stream, not from the offsetInMilliseconds. For example, if the Play directive contains progressReportDelayInMilliseconds with a value of 20000, send the ProgressReportDelayElapsed event 20,000 milliseconds after the start of the track. However, if the Play directive contains an offsetInMilliseconds value of 10000 and a progressReportDelayInMilliseconds value of 20000, send the event 10,000 milliseconds into playback.
- progressReportIntervalInMilliseconds – If the Play directive contains the progressReportIntervalInMilliseconds parameter, send the ProgressReportIntervalElapsed event to AVS periodically at the specified interval from the start of the stream, not from the offsetInMilliseconds. For example, if the Play directive contains progressReportIntervalInMilliseconds with a value of 20000, send the ProgressReportIntervalElapsed event 20,000 milliseconds from the start of the track and every 20,000 milliseconds until the stream ends. However, if the Play directive contains an offsetInMilliseconds value of 10000 and a progressReportIntervalInMilliseconds value of 20000, send the event 10,000 milliseconds from the start of playback, and every 20,000 milliseconds after that until the stream ends.
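The arithmetic in these examples can be expressed as a short Python sketch. The function names and the stream_length_ms parameter are hypothetical. Two cases that this topic doesn't cover are treated as assumptions: a delay that is already behind the starting offset produces no event, and an interval boundary that falls exactly on the starting offset is skipped.

```python
from typing import List, Optional


def delay_report_time(offset_ms: int, delay_ms: int) -> Optional[int]:
    """Milliseconds into playback at which to send ProgressReportDelayElapsed.

    The delay is measured from the start of the stream, so the starting
    offset is subtracted. Returns None if playback starts at or past the
    delay position (assumption: no event is sent in that case).
    """
    if delay_ms <= offset_ms:
        return None
    return delay_ms - offset_ms


def interval_report_times(
    offset_ms: int, interval_ms: int, stream_length_ms: int
) -> List[int]:
    """Milliseconds into playback at which to send each
    ProgressReportIntervalElapsed event, until the stream ends."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    times = []
    # First interval boundary strictly after the starting offset.
    position = (offset_ms // interval_ms + 1) * interval_ms
    while position <= stream_length_ms:
        times.append(position - offset_ms)
        position += interval_ms
    return times


# Examples from the text:
# delay_report_time(0, 20000)                 -> 20000
# delay_report_time(10000, 20000)             -> 10000
# interval_report_times(10000, 20000, 70000)  -> [10000, 30000, 50000]
```

Both functions return times relative to the moment playback begins, which is usually what a device timer needs. The equivalent stream positions are the returned values plus offset_ms.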
Event sequences for common AVS AudioPlayer use cases
The following diagrams illustrate lifecycle events that your device should send and related actions that the device should take in response to directives sent by AVS. In conjunction with logs produced by the AVS Device SDK, you can use these diagrams to troubleshoot development and certification issues.
Scenario 1: "Alexa, play rock music from iHeartRadio."
Consider a scenario where a user makes a request to play rock music from iHeartRadio. The following diagram provides the appropriate sequencing of events sent to and directives expected from AVS. In this example, the first stream plays until completion, and the device sends a PlaybackFinished event.

Scenario 2: Stop and resume an audio stream
Consider a scenario where the user plays a song, and 45 seconds into playback, the user says, "Alexa, stop." After waiting 10 seconds, the user says, "Alexa, resume."
In this example, the user makes a request to stop audio playback. When the user interrupts audio playback, the device pauses the playback temporarily because the Dialog channel is active and in the foreground. When this occurs, your device must send PlaybackPaused. After AVS identifies your request, AVS sends StopCapture and Stop directives to instruct your device to close the microphone and to stop audio playback on the Content channel. In response to the Stop directive, the device sends a PlaybackStopped event. This scenario differs from the previous scenario, where the device sent a PlaybackFinished event when the stream played to completion.
Note: Send PlaybackPaused only when temporarily pausing audio to accommodate higher priority content. In this scenario, the higher priority content is the user request on the Dialog channel. Send PlaybackStopped in response to a Stop directive.
The following diagram illustrates the device sending the correct progress reports from the origination of a stream and highlights the use of channels to direct audio outputs, such as audio playback and Alexa speech.

Scenario 3: Use a physical control on a device to navigate to the next stream in your playback queue
Consider a scenario where a user plays a song, and 15 seconds into playback, the user presses the Next button on the device to skip to the next stream.

Scenario 4: Use voice to navigate to the next stream in your playback queue
In this scenario, the user plays a song, and 15 seconds into playback, the user says, "Alexa, next."

Scenario 5: An alarm interrupts music playback
In this scenario, a user asks the device to play music. During playback, an already-set alarm goes off, which the user stops.
The following diagram illustrates the appropriate sequencing of events sent to and directives expected from AVS and highlights the use of channels to direct audio outputs, such as audio playback and alarm management.

Scenario 6: "Alexa, what movies are playing by me?"
In this scenario, a user asks which movies are playing nearby. The following diagram provides the appropriate sequencing of events sent to and directives expected from AVS.

Next steps
- Review the AudioPlayer Interface.
Last updated: Nov 26, 2020