Request and Response JSON Reference


The Alexa Skills Kit enables you to give Alexa new abilities by building a cloud-based service. This service can be either a web service or an AWS Lambda function (AWS Lambda is a compute service offered by Amazon Web Services). This document details the protocol interface between the Alexa service and the web service or Lambda function you create.

Alexa communicates with your service via a request-response mechanism using HTTP over SSL/TLS. When a user interacts with an Alexa skill, your service receives a POST request containing a JSON body. The request body contains the properties necessary for the service to perform its logic and generate a JSON-formatted response.
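As a framework-agnostic sketch, the core of such a service parses the JSON body, branches on the request type, and returns a JSON-formatted response. The speech strings below are illustrative, not part of the protocol:

```python
import json

# Minimal sketch of a skill service's core logic: parse the POST body,
# branch on the request type, and return a JSON-formatted response.
# The speech strings here are illustrative placeholders.

def handle_alexa_request(body: str) -> str:
    event = json.loads(body)
    request_type = event["request"].get("type", "")

    if request_type == "LaunchRequest":
        speech = "Welcome to the example skill."
    elif request_type == "IntentRequest":
        speech = "Handling intent " + event["request"]["intent"]["name"] + "."
    else:
        speech = "Goodbye."

    return json.dumps({
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    })
```

A real service would wire this function into its HTTP server or Lambda handler and also verify the request signature before processing it.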

Request format

This section documents the format for the requests sent to your service.

HTTP header

POST / HTTP/1.1
Content-Type: application/json;charset=UTF-8
Host: your.application.endpoint
Content-Length:
Accept: application/json
Accept-Charset: utf-8
Signature:
SignatureCertChainUrl: https://s3.amazonaws.com/echo.api/echo-api-cert.pem

Request body syntax

The request body sent to your service is in JSON format. In this example, the device supports the AudioPlayer interface, but not the Display.RenderTemplate or VideoApp.Launch interfaces. The example also includes the Advertising property.

{
  "version": "1.0",
  "session": {
      "new": true,
      "sessionId": "amzn1.echo-api.session.[unique-value-here]",
      "application": {
          "applicationId": "amzn1.ask.skill.[unique-value-here]"
      },
      "attributes": {
          "key": "string value"
      },
      "user": {
          "userId": "amzn1.ask.account.[unique-value-here]",
          "accessToken": "Atza|AAAAAAAA..."
      }
  },
  "context": {
      "System": {
          "device": {
              "deviceId": "string",
              "supportedInterfaces": {
                  "AudioPlayer": {}
              },
              "persistentEndpointId": "amzn1.alexa.endpoint.[unique-value-here]"
          },
          "application": {
              "applicationId": "amzn1.ask.skill.[unique-value-here]"
          },
          "user": {
              "userId": "amzn1.ask.account.[unique-value-here]",
              "accessToken": "Atza|AAAAAAAA..."
          },
          "person": {
              "personId": "amzn1.ask.person.[unique-value-here]",
              "accessToken": "Atza|BBBBBBB..."
          },
          "unit": {
              "unitId": "amzn1.ask.unit.[unique-value-here]",
              "persistentUnitId": "amzn1.alexa.unit.did.[unique-value-here]"
          },
          "apiEndpoint": "https://api.amazonalexa.com",
          "apiAccessToken": "AxThk..."
      },
      "Advertising": {
          "advertisingId": "296D263C-87BC-86A3-18A7-D307393B83A9",
          "limitAdTracking": false
      },
      "AudioPlayer": {
          "playerActivity": "PLAYING",
          "token": "audioplayer-token",
          "offsetInMilliseconds": 0
      }
  },
  "request": {}
}

Request body properties

All requests include the version, context, and request objects at the top level. The session object is included for all standard requests, but it is not included for AudioPlayer, VideoApp, or PlaybackController requests.

Property Description Type

version

Version specifier for the request with the value defined as: "1.0"

string

session

Provides additional context associated with the request.

For the definition of the session object, see Session Object.

object

context

Provides your skill with information about the current state of the Alexa service and device at the time the request is sent to your service. This is included on all requests. For requests sent in the context of a session (LaunchRequest and IntentRequest), the context object duplicates the user and application information that is also available in the session.

For the definition of the context object, see Context Object.

object

request

Provides the details of the user's request. There are several different request types available, see:

Standard Requests:

Requests associated with a specific interface:

object

Request locale

Every request object includes a locale property. This is a string code indicating the user's locale, such as en-US for English (US). Use this to determine the language in which your skill should respond.
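A common pattern is to key localized strings on the language portion of the locale code. A minimal sketch, in which the message strings are placeholders for real localized resources:

```python
# Sketch: choose a response language from the request's locale code.
# The message strings are placeholders; a real skill would load full
# localized resources per locale.

MESSAGES = {
    "en": "Welcome!",
    "de": "Willkommen!",
    "fr": "Bienvenue !",
}

def greeting_for_locale(locale: str) -> str:
    # Locale codes are language-REGION pairs such as "en-US" or "de-DE";
    # fall back to English when the language is not supported.
    language = locale.split("-")[0]
    return MESSAGES.get(language, MESSAGES["en"])
```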

Supported locales:

Locale code  Language

ar-SA  Arabic (SA)
de-DE  German (DE)
en-AU  English (AU)
en-CA  English (CA)
en-GB  English (UK)
en-IN  English (IN)
en-US  English (US)
es-ES  Spanish (ES)
es-MX  Spanish (MX)
es-US  Spanish (US)
fr-CA  French (CA)
fr-FR  French (FR)
hi-IN  Hindi (IN)
it-IT  Italian (IT)
ja-JP  Japanese (JP)
pt-BR  Portuguese (BR)

For more about supporting multiple languages, see Develop Skills in Multiple Languages.

Session object

Standard request types (CanFulfillIntentRequest, LaunchRequest, IntentRequest, and SessionEndedRequest) include the session object. Requests from the GameEngine interface also include a session object.

Requests from interfaces such as AudioPlayer and PlaybackController are not sent in the context of a session, so they do not include the session object. The context.System.user and context.System.application objects provide the same user and application information as the same objects within session – see Context object.

Property Description Type

new

Indicates whether this is a new session. Returns true for a new session or false for an existing session.

boolean

sessionId

A unique identifier for the user's active session.

string

attributes

Map of key-value pairs. The attributes map is empty for requests where a new session has started with the property new set to true.

  • The key is a string that represents the name of the attribute. Type: string
  • The value is an object that represents the value of the attribute. Type: object

When returning your response, you can include data you need to persist during the session in the sessionAttributes property. The attributes you provide are then passed back to your skill on the next request.

map
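The round trip described above (read session.attributes from the incoming request, return updated values in sessionAttributes) can be sketched as follows; the turnCount attribute is an invented example:

```python
# Sketch: persist data across turns of a session by echoing the incoming
# session attributes back in sessionAttributes, with updates.
# turnCount is an invented example attribute.

def next_session_attributes(event: dict) -> dict:
    # session.attributes is empty (or absent) on the first request of a session.
    attributes = dict(event.get("session", {}).get("attributes") or {})
    attributes["turnCount"] = attributes.get("turnCount", 0) + 1
    return attributes

def build_response(event: dict, speech: str) -> dict:
    return {
        "version": "1.0",
        "sessionAttributes": next_session_attributes(event),
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": False,
        },
    }
```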

application

Contains an application ID. This is used to verify that the request was intended for your service:

  • applicationId: A string representing the application ID for your skill.

This information is also available in the context.System.application property.

To see the application ID for your skill, navigate to the list of skills and click the View Skill ID link for the skill.

object
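A sketch of that verification; EXPECTED_APPLICATION_ID is a placeholder for your skill's actual ID:

```python
# Sketch: reject requests whose application ID does not match your skill's ID.
# EXPECTED_APPLICATION_ID is a placeholder value.

EXPECTED_APPLICATION_ID = "amzn1.ask.skill.your-skill-id"

def is_request_for_this_skill(event: dict) -> bool:
    # session.application is absent on out-of-session requests (such as
    # AudioPlayer requests), so fall back to context.System.application,
    # which is present on all requests.
    app = (event.get("session", {}).get("application")
           or event.get("context", {}).get("System", {}).get("application", {}))
    return app.get("applicationId") == EXPECTED_APPLICATION_ID
```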

user

Describes the Amazon account for which the skill is enabled. The user object is different than the person object, because user refers to the Amazon account for which the skill is enabled, whereas person refers to a user whom Alexa recognizes by voice.

A user is composed of:

  • userId: A string that represents a unique identifier for the Amazon account for which the skill is enabled. The length of this identifier can vary, and there's no restriction on the number of characters it can have. Alexa automatically generates the userId when a user enables the skill in the Alexa app. Normally, disabling and re-enabling a skill generates a new identifier. However, if the skill offers consumable purchases, the userId is not reset. See Maintain the user inventory if the user disables and re-enables the skill.

  • accessToken: A token identifying the user in another system. This token is only provided if the user has successfully linked their account. See Understand Account Linking for more details.

  • permissions: Deprecated. An object that contains a consentToken allowing the skill access to information that the customer has consented to provide, such as address information. Because consentToken is deprecated, instead use the apiAccessToken available in the context object to determine the user's permissions. See Permissions for more details.

The accessToken field does not appear if null, and the permissions object also does not appear if consentToken is null.

This information is also available in the context.System.user property.

object

Context object

The context object provides your skill with information about the current state of the Alexa service and device at the time the request is sent to your service. This is included on all requests. For requests sent in the context of a session (CanFulfillIntentRequest, LaunchRequest and IntentRequest), the context object duplicates the user and application information that is also available in the session object.

Property Description Type

Advertising

(Optional) Provides the customer's advertising ID and preference for receiving interest-based ads. Included in requests to skills that declare that the skill delivers advertising.

For the definition of the Advertising object, see Advertising.

object

Alexa.Presentation.APL

Provides information about the Alexa Presentation Language document currently displayed on the screen. Included in the request when the user's device has a screen and the screen is displaying an APL document your skill sent with the RenderDocument directive.

For details about the data provided in this property, see APL Visual Context in the Skill Request. For details about APL, see Add Visuals and Audio to Your Skill.

object

AudioPlayer

Provides the current state for the AudioPlayer interface. For the definition of the AudioPlayer object, see AudioPlayer Object.

Note that AudioPlayer is included on all customer-initiated requests (such as requests made by voice or using a remote control), but includes the details about the playback (token and offsetInMilliseconds) only when sent to a skill that was most recently playing audio.

object

System

Provides information about the current state of the Alexa service and the device interacting with your skill.

For the definition of the system object, see System object.

object

Viewport

Included when the user's device has a screen. The Viewport object provides information about the viewport, such as its size and shape. For details, see Viewport object in the skill request in the Alexa.Presentation.APL Interface Reference.

object

Viewports

Included when the user's device has a screen or a character display. Contains objects that provide information about the screens or displays available. For details, see Viewport object in the skill request in the Alexa.Presentation.APLT Interface Reference.

array

System object

Property Description Type

apiAccessToken

Contains a token that can be used to access Alexa-specific APIs. This token encapsulates the permissions granted to your skill.

This token is included in all requests sent to your skill. When using this token to access an API that requires permissions, your skill should call the API and check the return code. If a 403 (access denied) code is returned, your skill can then take appropriate actions to request the permissions from the user.

string
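A sketch of that pattern, with the network call (shown with urllib) separated from the 403 check so the decision logic stands on its own. The function names are illustrative:

```python
import urllib.request

# Sketch: call an Alexa API using apiAccessToken from context.System, and
# treat HTTP 403 as "the customer has not granted the required permission".
# Function names are illustrative; only urlopen would touch the network.

def api_request(system: dict, path: str) -> urllib.request.Request:
    # apiEndpoint gives the correct regional base URI for this customer.
    return urllib.request.Request(
        system["apiEndpoint"] + path,
        headers={"Authorization": "Bearer " + system["apiAccessToken"]},
    )

def needs_permission(status_code: int) -> bool:
    # On 403, the skill should ask the user for the permission
    # (for example, with a permissions consent card).
    return status_code == 403
```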

apiEndpoint

Provides the correct base URI for the user's region, for use with APIs such as the Device Location API and Progressive Response API.

string

application

Contains an application ID. Use the ID to verify that the request was intended for your service:

  • applicationId: A string representing the application ID for your skill.

This information is also available in the session.application property for CanFulfillIntentRequest, LaunchRequest, IntentRequest, and SessionEndedRequest types.

The application ID is displayed in the developer console. You can see it when you pick AWS Lambda ARN on the Custom > Endpoint page. It is also shown below the skill name in your list of skills.

object

device

Provides information about the device used to send the request. The device object contains the following properties:

  • The deviceId property uniquely identifies the device.
  • The supportedInterfaces property lists each interface that the device supports. For example, if supportedInterfaces includes AudioPlayer {}, then you know that the device supports streaming audio using the AudioPlayer interface.
  • The persistentEndpointId property is a persistent identifier for the endpoint from which the skill request is issued. An endpoint represents an Alexa-connected device (such as an Echo device or a smart light bulb) or a device-like entity (such as a scene or a specific application running on a device). For details about endpoints, see Endpoint API. Only registered Alexa Smart Properties for residential and Alexa Smart Properties for hospitality vendors can see the Read PersistentEndpointId toggle in the Alexa developer console. This identifier is vendor-based; all skills that belong to a particular vendor share this identifier.

object
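Before adding an interface-specific directive to a response, a skill can consult supportedInterfaces, as in this sketch:

```python
# Sketch: check whether the requesting device supports an interface
# (for example, AudioPlayer) before sending that interface's directives.

def device_supports(event: dict, interface: str) -> bool:
    device = event.get("context", {}).get("System", {}).get("device", {})
    return interface in device.get("supportedInterfaces", {})
```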

unit

Represents a logical construct organizing actors (such as people and organizations) and resources (such as devices and skills) that interact with Alexa systems.

A unit is composed of:

  • unitId: A string that represents a unique identifier for the unit in the context of a request. The length of this identifier can vary, and there's no restriction on the number of characters it can have. Alexa generates this string only when a request made to your skill has a valid unit context. Normally, disabling and re-enabling a skill generates a new identifier.

  • persistentUnitId: A string that represents a unique identifier for the unit in the context of a request. The length of this identifier can vary, but is never more than 255 characters. Alexa generates this string only when the request made to your skill has a valid unit context. This identifier is associated with an organization's developer account. Only registered Alexa Smart Properties for residential and Alexa Smart Properties for hospitality vendors can see the Read PersistentUnitId toggle in the Alexa developer console. This identifier is vendor-based; all skills that belong to a particular vendor share this identifier.

object

person

Describes the person who is making the request to Alexa. The person object is different than the user object, because person refers to a user whom Alexa recognizes by voice, whereas user refers to the Amazon account for which the skill is enabled.

A person is composed of:

  • personId: A string that represents a unique identifier for the person who is making the request. The length of this identifier can vary, and there's no restriction on the number of characters it can have. Alexa generates this string when a recognized speaker makes a request to your skill. Normally, disabling and re-enabling a skill generates a new identifier.

  • accessToken: A token identifying the person in another system. This field only appears in the request if the person has successfully linked their account with their Alexa profile.

The accessToken field will not appear if null.

object

user

Describes the Amazon account for which the skill is enabled. The user object is different than the person object, because user refers to the Amazon account for which the skill is enabled, whereas person refers to a user whom Alexa recognizes by voice.

A user is composed of:

  • userId: A string that represents a unique identifier for the Amazon account for which the skill is enabled. The length of this identifier can vary, and there's no restriction on the number of characters it can have. Alexa automatically generates the userId when a user enables the skill in the Alexa app. Normally, disabling and re-enabling a skill generates a new identifier. However, if the skill offers consumable purchases, the userId is not reset. See Maintain the user inventory if the user disables and re-enables the skill.

  • accessToken: A token identifying the user in another system. This token is only provided if the user has successfully linked their account. See Understand Account Linking for more details.

  • permissions: Deprecated. An object that contains a consentToken allowing the skill access to information that the customer has consented to provide, such as address information. Because consentToken is deprecated, instead use the apiAccessToken available in the context object to determine the user's permissions. See Permissions for more details.

The accessToken field does not appear if null, and the permissions object also does not appear if consentToken is null.

This information is also available in the session.user property for LaunchRequest, IntentRequest, and SessionEndedRequest types.

object

Advertising object

The Advertising object provides the customer's advertising ID and preference for receiving interest-based ads. Alexa includes the Advertising object in requests to custom skills that declare that the skill delivers advertising. For more details, see About Alexa Advertising ID.

Property Description Type Required

advertisingId

Customer-resettable, unique identifier that maps to the ifa attribute of the OpenRTB API specification.

Formatted as a version 4 UUID string separated by dashes (8-4-4-4-12).
Example: E0DE19C7-43A8-4738-AfA7-3A7f1B3C0367

String

Yes

limitAdTracking

Indicates whether the customer wants to receive interest-based ads. Set to true when the customer opts out of interest-based ads and tracking.
The limitAdTracking property maps to the lmt attribute of the OpenRTB API specification.

Boolean

Yes

Examples

The following examples show settings that indicate that the customer opted out of interest-based ads and tracking.

"Advertising": {
    "advertisingId": "8D5E212-165B-4CA0-909B-C86B9CEE0111",
    "limitAdTracking": true
}

"Advertising": {
    "advertisingId": "00000000-0000-0000-0000-00000000",
    "limitAdTracking": true
}

The following example shows settings that indicate that the customer opted in to interest-based ads and tracking.

"Advertising": {
    "advertisingId": "8D5E212-165B-4CA0-909B-C86B9CEE0111",
    "limitAdTracking": false
}

AudioPlayer object

This object provides the current state for the AudioPlayer interface.

AudioPlayer is included on all customer-initiated requests (such as requests made by voice or using a remote control), but includes the details about the playback (token and offsetInMilliseconds) only when sent to a skill that was most recently playing audio.

Requests that are not customer-initiated, such as AudioPlayer requests, do not include the AudioPlayer object in the context. For these requests, the request type indicates the current state (for example, AudioPlayer.PlaybackStarted indicates that playback has started), and details about the state are part of the request object.
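A skill that streams audio can use this state to resume playback from where it left off. A sketch, in which url_for_token stands in for the skill's own token-to-URL mapping:

```python
# Sketch: use context.AudioPlayer to build an AudioPlayer.Play directive that
# resumes the most recent stream from its saved offset. url_for_token is a
# placeholder for the skill's own mapping from tokens back to stream URLs.

def resume_directive(audio_player: dict, url_for_token) -> dict:
    return {
        "type": "AudioPlayer.Play",
        "playBehavior": "REPLACE_ALL",
        "audioItem": {
            "stream": {
                "token": audio_player["token"],
                "url": url_for_token(audio_player["token"]),
                "offsetInMilliseconds": audio_player.get("offsetInMilliseconds", 0),
            }
        },
    }
```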

Property Description Type

token

An opaque token that represents the audio stream described by this AudioPlayer object. You provide this token when sending the Play directive. This is only included in the AudioPlayer object when your skill was the skill most recently playing audio on the device.

string

offsetInMilliseconds

Identifies a track's offset in milliseconds at the time the request was sent. This is 0 if the track is at the beginning. This is only included in the AudioPlayer object when your skill was the skill most recently playing audio on the device.

long

playerActivity

Indicates the last known state of audio playback:

  • IDLE: Nothing was playing, no enqueued items.
  • PAUSED: Stream was paused.
  • PLAYING: Stream was playing.
  • BUFFER_UNDERRUN: Buffer underrun occurred.
  • FINISHED: Stream finished playing.
  • STOPPED: Stream was interrupted.

string

Response format

This section documents the format of the response that your service returns. The service for an Alexa skill must send its response in JSON format.

Note the following size limitations for the response:

  • The outputSpeech response can't exceed 8000 characters.
  • All of the text included in a card can't exceed 8000 characters. This includes the title, content, text, and image URLs.
  • An image URL (smallImageUrl or largeImageUrl) can't exceed 2000 characters.
  • When using the <audio> SSML tag:
    • The combined total time for all audio files in the outputSpeech property of the response can't be more than 240 seconds.
    • The combined total time for all audio files in the reprompt property of the response can't be more than 90 seconds.
  • The token included in an audioItem.stream for the AudioPlayer.Play directive can't exceed 1024 characters.
  • The url included in an audioItem.stream for the AudioPlayer.Play directive can't exceed 8000 characters.
  • The payload of a CustomInterfaceController.SendDirective directive can't exceed 1000 bytes. For details about this directive, see Respond to Alexa with a directive targeted to a gadget.
  • The total size of your response can't exceed 120 KB.

If your response exceeds these limits, the Alexa service returns an error.
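Because an oversized response fails outright, it can be worth checking the documented limits before returning. A partial sketch covering two of them; the function name is illustrative:

```python
import json

# Sketch: enforce two of the documented response limits before returning.
# The 8000-character and 120 KB figures come from the list above.

MAX_OUTPUT_SPEECH_CHARS = 8000
MAX_RESPONSE_BYTES = 120 * 1024

def violations(response: dict) -> list:
    problems = []
    speech = response.get("response", {}).get("outputSpeech", {})
    text = speech.get("text") or speech.get("ssml") or ""
    if len(text) > MAX_OUTPUT_SPEECH_CHARS:
        problems.append("outputSpeech exceeds 8000 characters")
    if len(json.dumps(response).encode("utf-8")) > MAX_RESPONSE_BYTES:
        problems.append("total response exceeds 120 KB")
    return problems
```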

HTTP header

HTTP/1.1 200 OK
Content-Type: application/json;charset=UTF-8
Content-Length:

Response body syntax

{
  "version": "string",
  "sessionAttributes": {
    "key": "value"
  },
  "response": {
    "outputSpeech": {
      "type": "PlainText",
      "text": "Plain text string to speak",
      "playBehavior": "REPLACE_ENQUEUED"      
    },
    "card": {
      "type": "Standard",
      "title": "Title of the card",
      "text": "Text content for a standard card",
      "image": {
        "smallImageUrl": "https://url-to-small-card-image...",
        "largeImageUrl": "https://url-to-large-card-image..."
      }
    },
    "reprompt": {
      "outputSpeech": {
        "type": "PlainText",
        "text": "Plain text string to speak",
        "playBehavior": "REPLACE_ENQUEUED"             
      }
    },
    "directives": [
      {
        "type": "InterfaceName.Directive"
        (...properties depend on the directive type)
      }
    ],
    "shouldEndSession": true
  }
}

Response properties

Property Description Type Required

version

Version specifier for the response with the value defined as: "1.0"

string

Yes

sessionAttributes

Map of key-value pairs to persist in the session.

  • The key is a string that represents the name of the attribute. Type: string.
  • The value is an object that represents the value of the attribute. Type: object.

Session attributes are ignored if included in a response to an AudioPlayer or PlaybackController request.

map

No

response

Defines what to render to the user and whether to end the current session.

Response object

Yes

Response object

Property Description Type Required

outputSpeech

Speech to render to the user. See OutputSpeech Object.

object

No

card

Card to render to the Amazon Alexa App. See Card Object.

object

No

reprompt

OutputSpeech to use if a re-prompt is necessary.

Used if your service keeps the session open after sending the response (shouldEndSession is false), but the user doesn't respond with anything that maps to an intent defined in your voice interface while the microphone is open. The user has a few seconds to respond to the reprompt before Alexa closes the session.

If the reprompt doesn't have a value, Alexa doesn't reprompt the user.

object

No

shouldEndSession

Indicates what should happen after Alexa speaks the response:

  • true: The session ends.
  • false or null: Alexa opens the microphone for a few seconds to listen for the user's response. Include a reprompt to give the user a second chance to respond.
  • undefined: The session's behavior depends on the type of Echo device. If the device has a screen and the skill displays screen content, the session stays open for up to 30 more seconds, without opening the microphone to prompt the user for input. For details, see How devices with screens affect the skill session. If the user speaks and precedes their request with the wake word (such as "Alexa,") Alexa sends the request to the skill. Otherwise, Alexa ignores the user's speech. If an Alexa Gadgets event handler is active, the session continues to stay open until the skill calls CustomInterfaceController.StopEventHandler or the event handler expires.

Responses to AMAZON.StopIntent must use true.

Boolean

No

directives

List of directives specifying device-level actions to take using a particular interface, such as the AudioPlayer interface for streaming audio. For details about the directives you can include in your response, see:

array

No
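The interplay of outputSpeech, reprompt, and shouldEndSession is often wrapped in two small helpers: one that speaks and ends the session, and one that keeps it open with a reprompt. A sketch with illustrative names:

```python
# Sketch: two common response shapes. tell() speaks and ends the session;
# ask() keeps the session open and supplies a reprompt for when the user
# does not respond. Helper names are illustrative.

def tell(speech: str) -> dict:
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }

def ask(speech: str, reprompt: str) -> dict:
    response = tell(speech)
    response["response"]["shouldEndSession"] = False
    response["response"]["reprompt"] = {
        "outputSpeech": {"type": "PlainText", "text": reprompt}
    }
    return response
```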

OutputSpeech object

This object is used for setting both the outputSpeech and the reprompt properties.

This object can only be included when sending a response to a CanFulfillIntentRequest, LaunchRequest, IntentRequest, Display.ElementSelected request or an InputHandlerEvent.

Property Description Type Required

type

Type of output speech to render.
Valid values:

  • "PlainText": Indicates that the output speech is defined as plain text.
  • "SSML": Indicates that the output speech is text marked up with SSML.

string

Yes

text

Speech to render to the user. Use this property when type is "PlainText"

string

Yes (for PlainText)

ssml

Text marked up with SSML to render to the user. Use this when type is "SSML"

string

Yes (for SSML)

playBehavior

Determines the queuing and playback of this output speech.
Valid values:

  • "ENQUEUE": Add this speech to the end of the queue. Do not interrupt Alexa's current speech. This is the default value for all skills.
  • "REPLACE_ALL": Immediately begin playback of this speech, and replace any current and enqueued speech.
  • "REPLACE_ENQUEUED": Replace all speech in the queue with this speech. Do not interrupt Alexa's current speech.

string

No
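A sketch of constructing either variant; note that SSML output must be wrapped in a root <speak> element:

```python
# Sketch: build an outputSpeech object of either type. SSML output must be
# wrapped in a root <speak> element.

def output_speech(text: str, ssml: bool = False) -> dict:
    if ssml:
        return {"type": "SSML", "ssml": "<speak>" + text + "</speak>"}
    return {"type": "PlainText", "text": text}
```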

Card object

This object can only be included when sending a response to a CanFulfillIntentRequest, LaunchRequest, IntentRequest, or InputHandlerEvent.

Property Description Type Required

type

Describes the type of card to render. Valid types are Simple, Standard, and LinkAccount; the title, content, text, and image properties below note which card types they apply to.

string

Yes

title

Title of the card (not applicable for cards of type LinkAccount).

string

No

content

Contents of a Simple card (not applicable for cards of type Standard or LinkAccount).

string

No

text

Text content for a Standard card (not applicable for cards of type Simple or LinkAccount).

string

No

image

Image object that specifies the URLs for the image to display on a Standard card. Only applicable for Standard cards.

You can provide two URLs, for use on different sized screens.

  • smallImageUrl
  • largeImageUrl

See Including a Card in Your Skill's Response.

object

No

Reprompt object

The reprompt object is valid when sending a response to a CanFulfillIntentRequest, LaunchRequest, or IntentRequest.

Alexa speaks the reprompt when shouldEndSession is false and the user doesn't respond within a few seconds.

Property Description Type Required

outputSpeech

OutputSpeech object containing the text or SSML to render as a re-prompt.

object

No

directives

List of directives specifying device-level actions to take using a particular interface. Within a reprompt object, you can include the Alexa.Presentation.APLA.RenderDocument directive.

No other directives are supported within the reprompt object. Including any other directives in the directives array for a reprompt causes an error and ends the skill session.

array

No

When both outputSpeech and directives are included, Alexa speaks the outputSpeech first and then plays the audio generated by the Alexa.Presentation.APLA.RenderDocument directive. When the directives array contains multiple Alexa.Presentation.APLA.RenderDocument directives, they play in array order.

Errors

InternalServerError

  • An error occurred while handling a request within your service.
  • HTTP Status Code: 500

Response examples

Standard response to CanFulfillIntentRequest, LaunchRequest or IntentRequest example

This response does not use any interfaces (such as AudioPlayer), so it returns the standard response properties (outputSpeech, card, reprompt, and shouldEndSession). A response to CanFulfillIntentRequest includes extra properties specific to that request type.

{
  "version": "1.0",
  "sessionAttributes": {
    "supportedHoroscopePeriods": {
      "daily": true,
      "weekly": false,
      "monthly": false
    }
  },
  "response": {
    "outputSpeech": {
      "type": "PlainText",
      "text": "Today will provide you a new learning opportunity.  Stick with it and the possibilities will be endless. Can I help you with anything else?"
    },
    "card": {
      "type": "Simple",
      "title": "Horoscope",
      "content": "Today will provide you a new learning opportunity.  Stick with it and the possibilities will be endless."
    },
    "reprompt": {
      "outputSpeech": {
        "type": "PlainText",
        "text": "Can I help you with anything else?"
      }
    },
    "shouldEndSession": false
  }
}

See also: CanFulfillIntent Response to CanFulfillIntentRequest

Response with an Alexa Presentation Language (APL) directive

See the RenderDocument directive example in the Alexa.Presentation.APL Interface Reference.

Response to IntentRequest or Launch Request with Directives Example

This response includes AudioPlayer interface directives. In this example, Alexa would speak the provided outputSpeech text before beginning the audio playback.

Note that this example shows a response sent from a LaunchRequest or IntentRequest. A response returned for an AudioPlayer or PlaybackController request cannot include the outputSpeech, card, reprompt, or shouldEndSession properties.

{
  "version": "1.0",
  "sessionAttributes": {},
  "response": {
    "outputSpeech": {
      "type": "PlainText",
      "text": "Playing the requested song."
    },
    "card": {
      "type": "Simple",
      "title": "Play Audio",
      "content": "Playing the requested song."
    },
    "reprompt": {
      "outputSpeech": {
        "type": "PlainText",
        "text": null
      }
    },
    "directives": [
      {
        "type": "AudioPlayer.Play",
        "playBehavior": "ENQUEUE",
        "audioItem": {
          "stream": {
            "token": "this-is-the-audio-token",
            "url": "https://my-audio-hosting-site.com/audio/sample-song.mp3",
            "offsetInMilliseconds": 0
          }
        }
      }
    ],
    "shouldEndSession": true
  }
}

Response to AudioPlayer or PlaybackController example (directives only)

This is an example of a response to an AudioPlayer or PlaybackController request (such as a PlaybackController.NextCommandIssued request sent when the user pressed the Next button on a remote). Such a response cannot include the outputSpeech, card, reprompt, or shouldEndSession properties.

{
  "version": "1.0",
  "response": {
    "directives": [
      {
        "type": "AudioPlayer.Play",
        "playBehavior": "REPLACE_ALL",
        "audioItem": {
          "stream": {
            "token": "track2-long-audio",
            "url": "https://my-audio-hosting-site.com/audio/sample-song-2.mp3",
            "offsetInMilliseconds": 0
          }
        }
      }
    ]
  }
}


Last updated: Nov 28, 2023