AudioPlayer Interface Reference

The AudioPlayer interface provides directives and requests for streaming audio and monitoring playback progression. Your skill can send directives to start and stop the playback. The Alexa service sends your skill AudioPlayer requests to give you information about the playback state, such as when the track is nearly finished, or when playback starts and stops. Alexa also sends PlaybackController requests in response to hardware buttons, such as on a remote control or the next and previous tap controls on Alexa-enabled devices with a screen.

For more details about the audio player, see Stream Long-Form Audio with AudioPlayer.

Directives and requests

The AudioPlayer interface includes the following directives and request types. You include the directives in a response to Alexa to start and stop an audio stream. Alexa sends the requests to notify your skill about changes to the playback state.

Interface	Description	Type
`AudioPlayer.Play`	Requests Alexa to stream the specified audio file.	Directive
`AudioPlayer.Stop`	Requests Alexa to stop the current audio stream.	Directive
`AudioPlayer.ClearQueue`	Requests Alexa to clear the queue of all audio streams.	Directive
`AudioPlayer.PlaybackStarted`	Notifies your skill that Alexa started the audio stream specified in a `Play` directive. This request lets your skill verify that playback began successfully.	Request
`AudioPlayer.PlaybackFinished`	Notifies your skill when the stream comes to an end on its own.	Request
`AudioPlayer.PlaybackStopped`	Sent when Alexa stops playing an audio stream in response to a voice request or an `AudioPlayer` directive.	Request
`AudioPlayer.PlaybackNearlyFinished`	Notifies your skill when the current stream is nearly complete and the device is ready to receive a new stream.	Request
`AudioPlayer.PlaybackFailed`	Notifies your skill when an error occurs during an attempt to play a stream.	Request

Play directive

Sends Alexa a request to stream the audio file identified by the specified audioItem. Use the playBehavior parameter to indicate whether to play the stream immediately or to add the stream to the queue. Include the Play directive in the directives array of your response to Alexa.

Note: You can send only one Play directive in a response.

When you send a Play directive, set the shouldEndSession flag in the response object to true to end the session. If you set this flag to false, Alexa sends the stream to the device for playback, and then pauses the stream to listen for the user's response.

Tip: When you respond to a LaunchRequest or IntentRequest, your response can include both AudioPlayer directives and standard response properties, such as outputSpeech, card, and reprompt. For example, when you include outputSpeech in the same response as a Play directive, Alexa speaks the provided text, and then starts to stream the audio.

Example directive response

The following example shows a directive entry in your response. For the full response format, see Response Format.

{
  "type": "AudioPlayer.Play",
  "playBehavior": "valid playBehavior value such as ENQUEUE",
  "audioItem": {
    "stream": {
      "url": "https://cdn.example.com/url-of-the-stream-to-play",
      "token": "opaque token representing this stream",
      "expectedPreviousToken": "opaque token representing the previous stream",
      "offsetInMilliseconds": 0,
      "captionData": {
        "content": "WEBVTT\n\n00:00.000 --> 00:02.107\n<00:00.006>My <00:00.0192>Audio <00:01.232>Captions.\n",
        "type": "WEBVTT"
      }
    },
    "metadata": {
      "title": "title of the track to display",
      "subtitle": "subtitle of the track to display",
      "art": {
        "sources": [
          {
            "url": "https://cdn.example.com/url-of-the-album-art-image.png"
          }
        ]
      },
      "backgroundImage": {
        "sources": [
          {
            "url": "https://cdn.example.com/url-of-the-background-image.png"
          }
        ]
      }
    }
  }
}
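
As a rough illustration of how the directive fits into a complete skill response, the following Python sketch builds a response that optionally speaks a short introduction and then starts the stream. The function name `build_play_response` and the sample URL and token are hypothetical, and the sketch sets only the required stream fields.

```python
import json


def build_play_response(url, token, speech_text=None, offset_ms=0):
    """Build a skill response that starts a stream with REPLACE_ALL."""
    response = {
        "directives": [
            {
                "type": "AudioPlayer.Play",
                "playBehavior": "REPLACE_ALL",
                "audioItem": {
                    "stream": {
                        "url": url,
                        "token": token,
                        "offsetInMilliseconds": offset_ms,
                    }
                },
            }
        ],
        # End the session so Alexa doesn't pause the stream to listen.
        "shouldEndSession": True,
    }
    if speech_text:
        # Alexa speaks this text first, and then starts to stream the audio.
        response["outputSpeech"] = {"type": "PlainText", "text": speech_text}
    return {"version": "1.0", "response": response}


if __name__ == "__main__":
    example = build_play_response(
        "https://cdn.example.com/track-1.mp3", "track-1", "Playing track one."
    )
    print(json.dumps(example, indent=2))
```

The sketch uses REPLACE_ALL because a response to a user's request to start playback usually replaces whatever was queued before; a playlist handler would use ENQUEUE instead.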

Directive parameters

Parameter	Description	Type	Required
`type`	Set to `AudioPlayer.Play`.	String	Yes
`playBehavior`	Describes playback behavior. Accepted values: `REPLACE_ALL`: Immediately begin playback of the specified stream, and replace current and enqueued streams. `ENQUEUE`: Add the specified stream to the end of the current queue. This request doesn't impact the currently playing stream. `REPLACE_ENQUEUED`: Replace all streams in the queue. This request doesn't impact the currently playing stream.	String	Yes
`audioItem`	Contains an object providing information about the audio stream to play.	Object	Yes
`audioItem.stream`	Contains an object representing the audio stream to play.	Object	Yes
`audioItem.stream.url`	Identifies the location of audio content at a remote HTTPS location on port 443. For more details, see Audio stream URL requirements.	String	Yes
`audioItem.stream.token`	Opaque token that identifies the audio stream. Maximum size: 1024 characters. The token should not contain Undefined Customer Data (UCD). Use the `token` in the following cases: To identify the stream when you use the `ENQUEUE` behavior. For more details, see Playlist Progression with ENQUEUE. To identify the stream for the purposes of displaying `metadata` on devices with screens. For more details, see Guidelines for images for Alexa-enabled devices with a screen.	String	Yes
`audioItem.stream.expectedPreviousToken`	An opaque token that represents the expected previous stream. This should match the value of `audioItem.stream.token` for the previous stream. This property is required and allowed only when the `playBehavior` is `ENQUEUE`. This is used to prevent potential race conditions if requests to progress through a playlist and change tracks occur at the same time. For details, see Playlist Progression with ENQUEUE.	String	Yes (when `playBehavior` is `ENQUEUE`)
`audioItem.stream.offsetInMilliseconds`	The timestamp in the stream from which Alexa should begin playback. Set to 0 to start playing the stream from the beginning. Set to any other value to start playback from that associated point in the stream.	Long	Yes
`audioItem.stream.captionData`	An object with two fields, `content` and `type`. Use these fields to provide captions for the associated audio attachment on any compatible device with a screen. A `captionData` object doesn't exist until it's provided with `content` and `type`. Devices assert support for AudioPlayer version 1.1 or later through the Capabilities API.	Object	No
`audioItem.stream.captionData.type`	The format of the string in the content field. Supported formats: WEBVTT	String	No
`audioItem.stream.captionData.content`	The time-encoded caption text. Supported formats: WEBVTT	String	No
`audioItem.metadata`	Information about the audio displayed on the Alexa-enabled device with a screen. The information isn't shown in the Alexa app. If you don't include this object, Alexa shows the skill name on a gray background by default. This entire object is optional. However, if you do include `audioItem.metadata`, provide all four `metadata` properties (`title`, `subtitle`, `art`, and `backgroundImage`). Associate each new metadata item with a different `audioItem.stream.token`. For more details, see Guidelines for images for Alexa-enabled devices with a screen. The metadata displays on devices with screens regardless of whether you include the `Display` interface.	Object	No
`audioItem.metadata.title`	The title text to display. This is typically used for the audio track title.	String	No
`audioItem.metadata.subtitle`	Subtitle text to display, such as a category or an artist name.	String	No
`audioItem.metadata.art`	An Image object representing the album art to display. This object uses the same format as images used in the Display interface templates. On the Echo Show or Fire TV Cube, this is the smaller square image. On the Echo Spot, this is cropped into the circle shape and displayed as the background. For best results, follow the image guidelines and specifications.	`Image` object	No
`audioItem.metadata.backgroundImage`	An Image object representing the background image to display. This object uses the same format as images used in the Display interface templates. On the Echo Show or Fire TV Cube, `backgroundImage` is the full background image. The image isn't shown on the Echo Spot. For best results, follow the image guidelines and specifications. Note: Background images for `AudioPlayer` skills get an automatic color overlay. The color comes from the skill's album cover color. Currently, there's nothing you can do to remove the color tint.	`Image` object	No
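
The constraints in the table above can be checked before a response is sent. The following Python sketch is one possible pre-flight check, not part of any Alexa SDK. The function name `validate_play_directive` is hypothetical, and two checks are assumptions: that the token must be non-empty, and that the URL must start with `https://` (inferred from the port 443 requirement).

```python
PLAY_BEHAVIORS = {"REPLACE_ALL", "ENQUEUE", "REPLACE_ENQUEUED"}
MAX_TOKEN_LENGTH = 1024


def validate_play_directive(directive):
    """Return a list of problems found in a Play directive (empty if none)."""
    problems = []
    behavior = directive.get("playBehavior")
    stream = directive.get("audioItem", {}).get("stream", {})

    if directive.get("type") != "AudioPlayer.Play":
        problems.append("type must be AudioPlayer.Play")
    if behavior not in PLAY_BEHAVIORS:
        problems.append("playBehavior must be one of " + ", ".join(sorted(PLAY_BEHAVIORS)))
    # Assumption: an HTTPS URL, consistent with the port 443 requirement.
    if not str(stream.get("url", "")).startswith("https://"):
        problems.append("stream.url must be an HTTPS URL")
    token = stream.get("token")
    if not isinstance(token, str) or not token or len(token) > MAX_TOKEN_LENGTH:
        problems.append("stream.token must be a non-empty string of at most 1024 characters")
    if "offsetInMilliseconds" not in stream:
        problems.append("stream.offsetInMilliseconds is required")

    # expectedPreviousToken is required with ENQUEUE and not allowed otherwise.
    has_previous = "expectedPreviousToken" in stream
    if behavior == "ENQUEUE" and not has_previous:
        problems.append("expectedPreviousToken is required with ENQUEUE")
    if behavior != "ENQUEUE" and has_previous:
        problems.append("expectedPreviousToken is allowed only with ENQUEUE")
    return problems
```

Returning a list of messages instead of raising on the first failure makes it easier to log every problem with a directive at once during testing.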

Playlist progression with ENQUEUE

The audioItem.stream.expectedPreviousToken property is required if playBehavior is ENQUEUE to handle situations in which requests to progress through a playlist and change tracks happen at the same time. The value of audioItem.stream.expectedPreviousToken should match the audioItem.stream.token property provided with the previous stream.

For example:

1. The skill is streaming track 2 in a playlist of several tracks.
2. The user says "Alexa, go back," which sends an AMAZON.PreviousIntent.
3. At about the same time, track 2 is nearly finished, so Alexa sends a PlaybackNearlyFinished request.
4. The skill handles the AMAZON.PreviousIntent first and sends a new Play directive with track 1. This track begins playing. The already-sent PlaybackNearlyFinished request is now outdated, since it assumed that track 2 was playing.
5. The skill handles the now-outdated PlaybackNearlyFinished request and sends a Play directive with track 3, since this is the next track after the originally playing track 2. This directive includes expectedPreviousToken set to the token for track 2.
6. The expectedPreviousToken provided in the directive doesn't match the token for the actively playing stream, so the device ignores this directive.
7. As track 1 finishes, Alexa sends a PlaybackNearlyFinished request. The skill responds with a Play directive for track 2. This track begins playing once track 1 finishes.

If this check wasn't in place, the directive sent in step 5 would put track 3 on the queue, which would cause the audio to skip from track 1 to track 3 when track 1 finishes.

Note: Including audioItem.stream.expectedPreviousToken when playBehavior is any other value (REPLACE_ALL or REPLACE_ENQUEUED) causes an error.
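
The sequence above can be reproduced with a toy model. The following Python sketch is not Alexa's implementation; the `DeviceQueue` class and its methods are hypothetical and model only the token comparison described in this section.

```python
class DeviceQueue:
    """Toy model of the device-side expectedPreviousToken check."""

    def __init__(self, playing_token):
        self.playing_token = playing_token
        self.enqueued = []

    def replace_all(self, token):
        # REPLACE_ALL: start the new stream and drop anything queued.
        self.playing_token = token
        self.enqueued = []

    def enqueue(self, token, expected_previous_token):
        # Ignore the directive if it was built for a stream that is
        # no longer the actively playing one.
        if expected_previous_token != self.playing_token:
            return False
        self.enqueued.append(token)
        return True


def simulate_race():
    device = DeviceQueue(playing_token="track-2")           # track 2 is playing
    device.replace_all("track-1")                           # PreviousIntent handled first
    stale_accepted = device.enqueue("track-3", "track-2")   # outdated NearlyFinished
    fresh_accepted = device.enqueue("track-2", "track-1")   # NearlyFinished for track 1
    return stale_accepted, fresh_accepted, device.enqueued


if __name__ == "__main__":
    print(simulate_race())
```

In this model the stale directive for track 3 is rejected and only track 2 ends up on the queue, which matches the outcome described in the example.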

Guidelines for images for Alexa-enabled devices with a screen

If you provide images in the audioItem.metadata.art and audioItem.metadata.backgroundImage properties, note the following guidelines:

- When you send a track with new metadata, be sure to also change the audioItem.stream.token property for the track.
- Your image must meet the requirements for an audio image. For more details, see Image requirements and recommendations.
- For the audioItem.metadata.art, use a square image for the best results. If the image isn't square, it's displayed with extra black space on the device. Note that the image is cropped to a circle shape on the Echo Spot.
- The Image object lets you provide multiple image URLs in the sources array. As with the Display interface, the device selects the image with the highest resolution to display.
- The following properties for a particular image source on the Image object aren't used when displaying the background image and album art for audio and can be left out of the object:
  - contentDescription
  - size
  - widthPixels
  - heightPixels

Important: Alexa identifies the metadata for a given audio stream by the audioItem.stream.token included in the Play directive. The Alexa service might cache the metadata associated with a particular audioItem.stream.token for up to five days. As a result, changes to the metadata, such as a different image or a change to the title text, might not appear on the device immediately. For example, during testing you might notice this behavior when you experiment with different images or title text for the same audio stream. To clear the cache, send a new Play directive with a different audioItem.stream.token.
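
One way to make sure the token changes whenever the metadata changes is to derive part of the token from the metadata itself. The following Python sketch shows that idea under stated assumptions: the function name `make_stream_token` and the `trackId:digest` token layout are hypothetical, and the token stays opaque to Alexa either way.

```python
import hashlib
import json


def make_stream_token(track_id, metadata):
    """Derive a token that changes whenever the track's metadata changes."""
    # Serialize with sorted keys so equal metadata always hashes the same.
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    # Keep track_id short so the token stays well under 1024 characters.
    return f"{track_id}:{digest}"


if __name__ == "__main__":
    print(make_stream_token("track-1", {"title": "First track", "subtitle": "Demo"}))
```

Because the digest depends only on the metadata content, replaying the same track with unchanged metadata reuses the same token, while an edited title or image URL produces a new one.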

Stop directive

Stops the current audio playback. Include the directive in the directives array in your response.

Example directive response

The following example shows a directive entry in your response. For the full response format, see Response Format.

{
  "type": "AudioPlayer.Stop"
}

Directive parameters

Parameter	Description	Type	Required
`type`	Set to `AudioPlayer.Stop`.	String	Yes

ClearQueue directive

Clears the audio playback queue. You can set this directive to clear the queue without stopping the currently playing stream, or clear the queue and stop any currently playing stream. Include the directive in the directives array in your response.

Example directive response

The following example shows a directive entry in your response. For the full response format, see Response Format.

{
  "type": "AudioPlayer.ClearQueue",
  "clearBehavior": "valid clearBehavior value such as CLEAR_ALL"
}

Directive parameters

Parameter	Description	Type	Required
`type`	Set to `AudioPlayer.ClearQueue`.	String	Yes
`clearBehavior`	Describes the clear queue behavior. Accepted values: `CLEAR_ENQUEUED`: clears the queue and continues to play the currently playing stream `CLEAR_ALL`: clears the entire playback queue and stops the currently playing stream (if applicable).	String	Yes
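
For illustration, the following Python sketch wraps the directive in a response and rejects unknown clearBehavior values before anything is sent. The function name `build_clear_queue_response` is hypothetical.

```python
CLEAR_BEHAVIORS = {"CLEAR_ENQUEUED", "CLEAR_ALL"}


def build_clear_queue_response(clear_behavior):
    """Build a response that contains a single ClearQueue directive."""
    if clear_behavior not in CLEAR_BEHAVIORS:
        raise ValueError(f"unsupported clearBehavior: {clear_behavior!r}")
    return {
        "version": "1.0",
        "response": {
            "directives": [
                {"type": "AudioPlayer.ClearQueue", "clearBehavior": clear_behavior}
            ]
        },
    }
```

Calling `build_clear_queue_response("CLEAR_ENQUEUED")` keeps the current stream playing and empties the queue, while `"CLEAR_ALL"` also stops the current stream.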

PlaybackStarted request

Sent when Alexa begins playing the audio stream previously sent in a Play directive. This request lets your skill verify that playback started successfully. Also, Alexa sends this request to notify your skill when Alexa resumes playback after pausing it for a voice request.

Important: The request doesn't include the session object because the request isn't sent in the context of a skill session. Use the context object to get details, such as the applicationId and userId.
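
Because there is no session object, a handler has to read these identifiers from the context object. The following Python sketch assumes the request body is already parsed into a dictionary; the function name `get_ids_from_context` is hypothetical.

```python
def get_ids_from_context(request_envelope):
    """Read applicationId and userId from context.System; either may be None."""
    system = request_envelope.get("context", {}).get("System", {})
    application_id = system.get("application", {}).get("applicationId")
    user_id = system.get("user", {}).get("userId")
    return application_id, user_id
```

The chained `get` calls return None instead of raising when a field is absent, which keeps the handler from failing on a partially filled context.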

Example request

{
  "version": "1.0",
  "context": {
    "System": {
      "application": {},
      "user": {},
      "device": {}
    }
  },
  "request": {
    "type": "AudioPlayer.PlaybackStarted",
    "requestId": "unique.id.for.the.request",
    "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
    "token": "token representing the currently playing stream",
    "offsetInMilliseconds": 0,
    "locale": "a locale code such as en-US"
  }
}

Request parameters

Parameter	Description	Type
`type`	`AudioPlayer.PlaybackStarted`	String
`requestId`	Represents a unique identifier for the specific request.	String
`timestamp`	Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service.	String
`token`	An opaque token that represents the audio stream. You provide this token when sending the `Play` directive.	String
`offsetInMilliseconds`	Identifies a track's offset in milliseconds when the `PlaybackStarted` request is sent.	Long
`locale`	A `string` indicating the user's locale. For example: `en-US`. See supported locale codes.	String

For the full request format, see Request Format.

Response

Your skill can respond to PlaybackStarted with a Stop or ClearQueue directive.

The response cannot include:

- Any standard properties such as outputSpeech, card, or reprompt.
- Any other AudioPlayer directives.
- Any other directives from other interfaces, such as a Dialog directive.

Note: Your skill isn't required to return a response to AudioPlayer requests.

PlaybackFinished request

Sent when the stream Alexa is playing comes to an end on its own. If your skill explicitly stops the playback with the Stop directive, Alexa sends PlaybackStopped instead of PlaybackFinished.

Example request

{
  "version": "1.0",
  "context": {
    "System": {
      "application": {},
      "user": {},
      "device": {}
    }
  },
  "request": {
    "type": "AudioPlayer.PlaybackFinished",
    "requestId": "unique.id.for.the.request",
    "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
    "token": "token representing the currently playing stream",
    "offsetInMilliseconds": 0,
    "locale": "a locale code such as en-US"
  }
}

Request parameters

Parameter	Description	Type
`type`	`AudioPlayer.PlaybackFinished`	String
`requestId`	Represents a unique identifier for the specific request.	String
`timestamp`	Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service.	String
`token`	An opaque token that represents the audio stream. You provide this token when sending the `Play` directive.	String
`offsetInMilliseconds`	Identifies a track's offset in milliseconds when the `PlaybackFinished` request is sent.	Long
`locale`	A `string` indicating the user's locale. For example: `en-US`. See supported locale codes.	String

Response

Your skill can respond to PlaybackFinished with a Stop or ClearQueue directive.

The response cannot include:

- Any standard properties such as outputSpeech, card, or reprompt.
- Any other AudioPlayer directives.
- Any other directives from other interfaces, such as a Dialog directive.

Note: Your skill isn't required to return a response to AudioPlayer requests.

PlaybackStopped request

Sent when Alexa stops playing an audio stream in response to one of the following AudioPlayer directives:

- Stop
- Play with a playBehavior of REPLACE_ALL.
- ClearQueue with a clearBehavior of CLEAR_ALL.

This request is also sent if the user makes a voice request to Alexa, since this temporarily pauses the playback. In this case, playback resumes automatically once the voice interaction is complete. If playback stops because the audio stream comes to an end on its own, Alexa sends PlaybackFinished instead of PlaybackStopped.

Example request

{
  "version": "1.0",
  "context": {
    "System": {
      "application": {},
      "user": {},
      "device": {}
    }
  },
  "request": {
    "type": "AudioPlayer.PlaybackStopped",
    "requestId": "unique.id.for.the.request",
    "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
    "token": "token representing the currently playing stream",
    "offsetInMilliseconds": 0,
    "locale": "a locale code such as en-US"
  }
}

Request parameters

Parameter	Description	Type
`type`	`AudioPlayer.PlaybackStopped`	String
`requestId`	Represents a unique identifier for the specific request.	String
`timestamp`	Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service.	String
`token`	An opaque token that represents the audio stream. You provide this token when sending the `Play` directive.	String
`offsetInMilliseconds`	Identifies a track's offset in milliseconds when the `PlaybackStopped` request is sent.	Long
`locale`	A `string` indicating the user's locale. For example: `en-US`. See supported locale codes.	String

Response

Your skill can't return a response to PlaybackStopped.
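
Although the skill can't respond, the request is still useful for remembering where playback stopped. The following Python sketch stores the token and offset and later builds a Play directive that resumes from that point. The in-memory `saved_positions` store, the function names, and the `url_for_token` callback are all hypothetical; a real skill would need persistent storage.

```python
# Hypothetical in-memory store keyed by userId.
saved_positions = {}


def handle_playback_stopped(request_envelope):
    """Remember where playback stopped. Nothing is returned to Alexa."""
    request = request_envelope["request"]
    user_id = request_envelope["context"]["System"]["user"]["userId"]
    saved_positions[user_id] = (request["token"], request["offsetInMilliseconds"])


def build_resume_directive(user_id, url_for_token):
    """Build a Play directive that resumes the saved stream, or None."""
    if user_id not in saved_positions:
        return None
    token, offset_ms = saved_positions[user_id]
    return {
        "type": "AudioPlayer.Play",
        "playBehavior": "REPLACE_ALL",
        "audioItem": {
            "stream": {
                # url_for_token maps an opaque token back to a stream URL.
                "url": url_for_token(token),
                "token": token,
                "offsetInMilliseconds": offset_ms,
            }
        },
    }
```

A handler for a resume intent could call `build_resume_directive` and include the result in the directives array of its response.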

PlaybackNearlyFinished request

Sent when the device is ready to add the next stream to the queue.

To progress through a playlist of audio streams, respond to this request with a Play directive for the next stream and set playBehavior to ENQUEUE or REPLACE_ENQUEUED. This adds the new stream to the queue without stopping the current playback. Alexa begins streaming the new audio item once the currently playing track finishes.

Example request

{
  "version": "1.0",
  "context": {
    "System": {
      "application": {},
      "user": {},
      "device": {}
    }
  },
  "request": {
    "type": "AudioPlayer.PlaybackNearlyFinished",
    "requestId": "unique.id.for.the.request",
    "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
    "token": "token representing the currently playing stream",
    "offsetInMilliseconds": 0,
    "locale": "a locale code such as en-US"
  }
}

Request parameters

Parameter	Description	Type
`type`	`AudioPlayer.PlaybackNearlyFinished`	String
`requestId`	Represents a unique identifier for the specific request.	String
`timestamp`	Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service.	String
`token`	An opaque token that represents the audio stream that is currently playing. You provide this token when sending the `Play` directive.	String
`offsetInMilliseconds`	Identifies a track's offset in milliseconds when the `PlaybackNearlyFinished` request is sent.	Long
`locale`	A `string` indicating the user's locale. For example: `en-US`. See supported locale codes.	String

Response

Your skill can respond to PlaybackNearlyFinished with any AudioPlayer directive.

The response cannot include:

- Any standard properties such as outputSpeech, card, or reprompt.
- Any other directives from other interfaces, such as a Dialog directive.

Note: Your skill isn't required to return a response to AudioPlayer requests.
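
As a sketch of playlist progression, the following Python handler answers PlaybackNearlyFinished by enqueueing the track that follows the one currently playing. The `PLAYLIST` data and the function name are hypothetical, and returning an empty response object when there is nothing to enqueue is an assumption of this sketch.

```python
# Hypothetical playlist: (token, url) pairs in playback order.
PLAYLIST = [
    ("track-1", "https://cdn.example.com/track-1.mp3"),
    ("track-2", "https://cdn.example.com/track-2.mp3"),
    ("track-3", "https://cdn.example.com/track-3.mp3"),
]


def handle_playback_nearly_finished(request_envelope, playlist=PLAYLIST):
    """Enqueue the track after the one that is currently playing."""
    empty = {"version": "1.0", "response": {}}
    current_token = request_envelope["request"]["token"]
    tokens = [token for token, _ in playlist]
    if current_token not in tokens:
        return empty
    next_index = tokens.index(current_token) + 1
    if next_index >= len(playlist):
        return empty  # end of the playlist, nothing to enqueue
    next_token, next_url = playlist[next_index]
    directive = {
        "type": "AudioPlayer.Play",
        "playBehavior": "ENQUEUE",
        "audioItem": {
            "stream": {
                "url": next_url,
                "token": next_token,
                # Must match the token of the stream this one follows.
                "expectedPreviousToken": current_token,
                "offsetInMilliseconds": 0,
            }
        },
    }
    # Only directives are included; speech, cards, and reprompts aren't allowed.
    return {"version": "1.0", "response": {"directives": [directive]}}
```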

PlaybackFailed request

Sent when Alexa encounters an error when attempting to play a stream.

This request type includes two token properties – one as a property of the request object, and one as a property of the currentPlaybackState object. The request.token property represents the stream that failed to play. The currentPlaybackState.token property can be different if Alexa is playing a stream and the error occurs when attempting to buffer the next stream on the queue. In this case, currentPlaybackState.token represents the stream that was successfully playing.

Example request

{
  "version": "1.0",
  "context": {
    "System": {
      "application": {},
      "user": {},
      "device": {}
    }
  },
  "request": {
    "type": "AudioPlayer.PlaybackFailed",
    "requestId": "unique.id.for.the.request",
    "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
    "token": "token representing the stream that failed to play",
    "offsetInMilliseconds": 0,
    "locale": "a locale code such as en-US",
    "error": {
      "type": "error code",
      "message": "description of the error that occurred"
    },
    "currentPlaybackState": {
      "token": "token representing stream playing when error occurred",
      "offsetInMilliseconds": 0,
      "playerActivity": "player state when error occurred, such as PLAYING"
    }
  }
}

Request parameters

Parameter	Description	Type
`type`	`AudioPlayer.PlaybackFailed`	String
`requestId`	Represents a unique identifier for the specific request.	String
`timestamp`	Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service.	String
`token`	An opaque token provided by the `Play` directive that represents the stream that failed to play.	String
`locale`	A `string` indicating the user's locale. For example: `en-US`. See supported locale codes.	String
`error`	Contains an object with error information.	Object
`error.type`	Identifies the specific type of error. For details about each error type, see Playback errors.	String
`error.message`	A description of the error the device has encountered.	String
`currentPlaybackState`	Contains an object providing details about the playback activity occurring at the time of the error.	Object
`currentPlaybackState.token`	An opaque token that represents the audio stream currently playing when the error occurred. Note that this may be different from the value of the `request.token` property.	String
`currentPlaybackState.offsetInMilliseconds`	Identifies a track's offset in milliseconds when the error occurred.	Long
`currentPlaybackState.playerActivity`	Identifies the player state when the error occurred: `PLAYING`, `PAUSED`, `FINISHED`, `BUFFER_UNDERRUN`, or `IDLE`.	String

Error Type	Description
`MEDIA_ERROR_UNKNOWN`	An unknown error occurred.
`MEDIA_ERROR_INVALID_REQUEST`	The request is malformed, unauthorized, forbidden, or not found.
`MEDIA_ERROR_SERVICE_UNAVAILABLE`	Alexa was unable to reach the URL for the stream.
`MEDIA_ERROR_INTERNAL_SERVER_ERROR`	Alexa accepted the request, but was unable to process the request as expected.
`MEDIA_ERROR_INTERNAL_DEVICE_ERROR`	There was an internal error on the device.
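
To show how the two token properties can be told apart, the following Python sketch summarizes a PlaybackFailed request. The function name `describe_playback_failure` and the keys of the returned dictionary are hypothetical.

```python
def describe_playback_failure(request):
    """Summarize a PlaybackFailed request for logging or recovery logic."""
    failed_token = request["token"]
    state = request.get("currentPlaybackState", {})
    playing_token = state.get("token")
    return {
        "failed_token": failed_token,
        "playing_token": playing_token,
        "error_type": request["error"]["type"],
        # True when a different stream was playing, which suggests the
        # failure happened while buffering the next stream on the queue.
        "failed_while_buffering_next": (
            playing_token is not None and playing_token != failed_token
        ),
    }
```

A skill might use `failed_while_buffering_next` to decide whether to enqueue a replacement stream or leave the current playback alone.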

Response

Your skill can respond to PlaybackFailed with any AudioPlayer directive.

The response cannot include:

- Any standard properties such as outputSpeech, card, or reprompt.
- Any other directives from other interfaces, such as a Dialog directive.

Note: Your skill isn't required to return a response to AudioPlayer requests.
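
The response rules for the AudioPlayer requests that accept directives can be collected into a single check. The following Python sketch is a hypothetical test helper, not part of any Alexa SDK. It takes the inner response object and covers only directives from the AudioPlayer interface and the three standard properties named in this reference.

```python
ALL_DIRECTIVES = {"AudioPlayer.Play", "AudioPlayer.Stop", "AudioPlayer.ClearQueue"}
STOP_OR_CLEAR = {"AudioPlayer.Stop", "AudioPlayer.ClearQueue"}

# Directive types each request type accepts in its response.
ALLOWED_DIRECTIVES = {
    "AudioPlayer.PlaybackStarted": STOP_OR_CLEAR,
    "AudioPlayer.PlaybackFinished": STOP_OR_CLEAR,
    "AudioPlayer.PlaybackNearlyFinished": ALL_DIRECTIVES,
    "AudioPlayer.PlaybackFailed": ALL_DIRECTIVES,
}
STANDARD_PROPERTIES = ("outputSpeech", "card", "reprompt")


def check_audio_player_response(request_type, response):
    """Return a list of rule violations for a response to an AudioPlayer request."""
    allowed = ALLOWED_DIRECTIVES[request_type]
    problems = [
        f"{name} is not allowed" for name in STANDARD_PROPERTIES if name in response
    ]
    for directive in response.get("directives", []):
        directive_type = directive.get("type")
        if directive_type not in allowed:
            problems.append(f"{directive_type} is not allowed for {request_type}")
    return problems
```

Running a check like this in unit tests can catch an invalid response before it reaches Alexa.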

System.ExceptionEncountered request

If a response to an AudioPlayer request causes an error, Alexa sends your skill a System.ExceptionEncountered request. Alexa ignores any directives included in the response to this request.

Example request

{
  "type": "System.ExceptionEncountered",
  "requestId": "unique.id.for.the.request",
  "timestamp": "timestamp of request in format: 2018-04-11T15:15:25Z",
  "locale": "a locale code such as en-US",
  "error": {
    "type": "error code such as INVALID_RESPONSE",
    "message": "description of the error that occurred"
  },
  "cause": {
    "requestId": "unique identifier for the request that caused the error"
  }
}

Request parameters

Parameter	Description	Type
`type`	`System.ExceptionEncountered`	String
`requestId`	Represents a unique identifier for the specific request.	String
`timestamp`	Provides the date and time when Alexa sent the request as an ISO 8601 formatted string. Used to verify the request when hosting your skill as a web service.	String
`locale`	A `string` indicating the user's locale. For example: `en-US`. See supported locale codes.	String
`error`	Contains an object with error information.	Object
`error.type`	Identifies the specific type of error (`INVALID_RESPONSE`, `DEVICE_COMMUNICATION_ERROR`, `INTERNAL_ERROR`).	String
`error.message`	A description of the error the device has encountered.	String
`cause.requestId`	The `requestId` for the request that caused the error.	String

Response

Your skill can't return a response to a System.ExceptionEncountered request.
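
Since no response is possible, the most useful thing a handler can do is record the error. The following Python sketch logs the error together with the requestId of the request whose response caused it. The logger name and the function name are hypothetical; the function returns the log line only so that callers and tests can inspect it.

```python
import logging

logger = logging.getLogger("audio_skill")  # hypothetical logger name


def handle_exception_encountered(request):
    """Log the error and the requestId of the request that caused it."""
    error = request.get("error", {})
    cause_request_id = request.get("cause", {}).get("requestId")
    message = "Alexa reported {}: {} (caused by the response to request {})".format(
        error.get("type"), error.get("message"), cause_request_id
    )
    logger.error(message)
    return message
```

Matching `cause.requestId` against the requestId values the skill logged for earlier AudioPlayer requests identifies which response Alexa rejected.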

Last updated: Jan 19, 2024
