Send the User a Progressive Response


Your skill can send progressive responses to keep the user engaged while your skill prepares a full response to the user's request. A progressive response is interstitial SSML content (including text-to-speech and short audio) that Alexa plays while waiting for your full skill response.

To send a progressive response, call the Progressive Response API and send a directive with the interstitial content.

When to send progressive responses

You can use progressive responses to do the following tasks:

  • Send text-to-speech confirmations that your skill has received the request and is processing an answer.
  • Play short soundmarks associated with your skill.
  • Provide other engaging content to your users while waiting for the full response.

Progressive responses can also reduce the user's perception of latency in your skill's response.

For example, a skill that looks up and books taxi rides might take a few seconds to call an external API to reserve a ride. Instead of remaining silent while it processes the request, the skill can send a progressive response to let the user know that it's working on the request, as in the following exchange.

User: Alexa, ask Ride Hailer to book a ride to the airport. (Normal IntentRequest sent to the Ride Hailer skill.)

Additional back-and-forth to collect all the information needed to fulfill this intent.
Alexa: OK, please wait while I look up details for your ride… (Progressive response while the skill prepares the full response.)
Alexa: OK, I've reserved your ride. It should arrive at your home in thirty minutes. (Normal response to the IntentRequest)

You can send progressive responses from the context of a current LaunchRequest or IntentRequest. You can't send progressive responses from any other request types (such as AudioPlayer requests).

Steps to send a progressive response

To send a progressive response, call the Progressive Response API and send a directive by taking the following steps:

  1. Get the required data from the incoming request (LaunchRequest or IntentRequest). You need the apiAccessToken and requestId to construct a valid Progressive Response API request.
  2. Call the Progressive Response API and send a directive (such as VoicePlayer.Speak) with the content. The content must be valid SSML wrapped in <speak> tags.
  3. Complete your normal skill processing.
  4. After the progressive response call completes, return your full skill response object. Note that you can't send any more progressive responses after you return the response object.

Progressive responses are only played on the device if they arrive before the Alexa service receives the skill's full response object. For the best user experience, your skill should wait until the progressive response call completes before you send the full response object. The Progressive Response API returns a 204 code after the progressive response is ready to be sent to the device.
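The steps above can be sketched as follows. This is a minimal illustration, not an official implementation: `handle_request`, `send_progressive`, and `do_work` are hypothetical names, and running the progressive call and the skill's own processing in parallel is an assumed design choice. The sketch waits for the progressive call to finish before it returns the full response object.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def handle_request(
    send_progressive: Callable[[], int],
    do_work: Callable[[], str],
) -> dict:
    """Send the progressive response and do the skill's work in parallel,
    then build the full response only after both have finished."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        progressive = pool.submit(send_progressive)  # step 2: call the API
        work = pool.submit(do_work)                  # step 3: normal processing
        speech_text = work.result()
        try:
            # Step 4: wait for the progressive call to complete first.
            progressive.result()
        except Exception:
            # Assumption: a failed progressive response shouldn't block
            # the full response, so the error is swallowed here.
            pass
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": True,
        },
    }
```

In a real skill, `send_progressive` would wrap the HTTP call to the Progressive Response API and `do_work` would wrap the slow lookup or booking call.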

Use audio within a progressive response

You can embed short recorded audio within a progressive response with the SSML <audio> tag. The audio can't be longer than 30 seconds. This limit is shorter than the audio length normally allowed in the <audio> tag.

For optimal performance, Amazon recommends that you host your MP3 files for SSML responses in close proximity to where your skill is hosted. For example, if the Lambda function for your skill is hosted in the US West (Oregon) region, you will get better performance if you upload your MP3s to a US West (Oregon) S3 bucket.

In addition to using S3 for hosting, Amazon recommends that you serve media assets through a content delivery network (CDN) such as Amazon CloudFront to prevent throttling under high load.

For additional requirements for the MP3 files when using this tag, see <audio>.
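As an illustration, the speech string for a progressive response that plays a short soundmark before the spoken text could be assembled as in the following sketch. The helper name `speech_with_audio` and the example URL are hypothetical. The helper doesn't check the length or format of the audio file.

```python
from xml.sax.saxutils import escape, quoteattr


def speech_with_audio(text: str, audio_url: str) -> str:
    """Wrap a short audio clip and some text in <speak> tags.

    The text is XML-escaped so that characters such as '&' don't
    break the SSML, and the URL is quoted as an attribute value.
    """
    return f"<speak><audio src={quoteattr(audio_url)}/> {escape(text)}</speak>"


# Hypothetical URL, for illustration only.
ssml = speech_with_audio("One moment.", "https://example.com/soundmark.mp3")
```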

Get the required data for the Progressive Response API

Every call to the Progressive Response API requires an apiAccessToken and the requestId for the specific request sent to your skill. You can get both of these fields from the request sent to your skill.

Get the API access token

Use the apiAccessToken provided in the context object. The token is included in all requests sent to your skill.

The context object is also included in all requests. You can access the apiAccessToken in context.System.apiAccessToken. Include the complete value of this property in your call to the Progressive Response API. In the following example, some objects and properties are removed for clarity.

{
  "version": "1.0",
  "session": {},
  "context": {
    "AudioPlayer": {},
    "System": {
      "application": {
        "applicationId": "amzn1.ask.skill.[unique-value-here]"
      },
      "user": {},
      "device": {},
      "apiEndpoint": "https://api.amazonalexa.com",
      "apiAccessToken": "AxThk..."
    }
  },
  "request": {}
}

Get the request identifier

The requestId identifies a specific request sent from Alexa to your skill. This value is included in all requests sent to your skill as part of the request object, such as LaunchRequest or IntentRequest. You can get the requestId from request.requestId. In the following example, some objects and properties are removed for clarity.

{
  "version": "1.0",
  "session": {},
  "context": {},
  "request": {
    "type": "LaunchRequest",
    "requestId": "amzn1.echo-api.request.xxxxxxx",
    "timestamp": "2015-05-13T12:34:56Z",
    "locale": "en-US"
  }
}

Pass the entire value of the request.requestId property to the Progressive Response API.
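A minimal sketch of reading both values from the incoming request follows. It assumes that the request has already been parsed from JSON into a Python dict. The names `ProgressiveRequestData` and `extract_request_data` are hypothetical.

```python
from typing import NamedTuple


class ProgressiveRequestData(NamedTuple):
    api_access_token: str
    request_id: str


def extract_request_data(envelope: dict) -> ProgressiveRequestData:
    """Read the two values that the Progressive Response API needs."""
    return ProgressiveRequestData(
        api_access_token=envelope["context"]["System"]["apiAccessToken"],
        request_id=envelope["request"]["requestId"],
    )


# Trimmed-down request, matching the examples in this section.
sample = {
    "version": "1.0",
    "context": {"System": {"apiAccessToken": "AxThk..."}},
    "request": {
        "type": "LaunchRequest",
        "requestId": "amzn1.echo-api.request.xxxxxxx",
    },
}
data = extract_request_data(sample)
```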

API endpoint and geographic location of the skill

The endpoint for the Progressive Response API varies depending on the geographic location of your skill. You can get the correct base URL to use from the apiEndpoint value in the System object. That is, the API endpoint is context.System.apiEndpoint.

{
  "version": "1.0",
  "session": {},
  "context": {
    "System": {
      "application": {
        "applicationId": "amzn1.ask.skill.<skill-id>"
      },
      "user": {},
      "apiAccessToken": "AxThk...",
      "apiEndpoint": "https://api.amazonalexa.com"
    }
  },
  "request": {}
}

The examples on this page use the US endpoint (https://api.amazonalexa.com/).
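Rather than hard-coding the US endpoint, a skill can build the URL from the incoming request. The following sketch assumes that the request is a parsed dict and appends the /v1/directives path used by this API. `directives_url` is a hypothetical helper name.

```python
def directives_url(envelope: dict) -> str:
    """Build the Progressive Response API URL from context.System.apiEndpoint."""
    base = envelope["context"]["System"]["apiEndpoint"]
    # Strip any trailing slash so that the result has exactly one separator.
    return base.rstrip("/") + "/v1/directives"


sample = {"context": {"System": {"apiEndpoint": "https://api.amazonalexa.com"}}}
url = directives_url(sample)
```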

For details about configuring your skill for multiple languages, see Develop Skills in Multiple Languages.

Send a directive

This API call sends the specified directive to Alexa. Currently, the only supported directive is VoicePlayer.Speak. This directive instructs Alexa to speak the provided speech. You must wrap the speech in <speak> tags. You can use SSML tags within the speech as needed.

Your skill can send a maximum of five directive requests for a single user request. Any calls beyond this limit are rejected. Each directive request must be sent in a separate API call.
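To avoid rejected calls, a skill might count the directives that it has sent for each requestId. The following sketch is one possible client-side guard. The class name `DirectiveBudget` is hypothetical, and the counter lives only in the memory of the current process.

```python
from collections import Counter

MAX_DIRECTIVES_PER_REQUEST = 5  # limit stated in this section


class DirectiveBudget:
    """Tracks how many directive requests were sent for each requestId."""

    def __init__(self) -> None:
        self._sent = Counter()

    def try_acquire(self, request_id: str) -> bool:
        """Return True and count the call if the limit isn't reached yet."""
        if self._sent[request_id] >= MAX_DIRECTIVES_PER_REQUEST:
            return False
        self._sent[request_id] += 1
        return True
```

Call `try_acquire` before each directive request and skip the call when it returns False.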

Endpoint: /v1/directives

Method: POST

Directive request

POST https://api.amazonalexa.com/v1/directives HTTP/1.1
Authorization: Bearer AxThk...
Content-Type: application/json

{ 
  "header":{ 
    "requestId":"amzn1.echo-api.request.xxxxxxx"
  },
  "directive":{ 
    "type":"VoicePlayer.Speak",
    "speech":"<speak>This text is spoken while your skill processes the full response.</speak>"
  }
}

Note that the actual token in the Authorization header would be much longer than shown in this example.
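The same request can be constructed with Python's standard library, as in the following sketch. The function name `build_directive_request` is hypothetical. The function only builds the request object and doesn't send it.

```python
import json
import urllib.request


def build_directive_request(
    api_endpoint: str,
    api_access_token: str,
    request_id: str,
    speech: str,
) -> urllib.request.Request:
    """Build the POST request for a VoicePlayer.Speak directive."""
    body = {
        "header": {"requestId": request_id},
        "directive": {"type": "VoicePlayer.Speak", "speech": speech},
    }
    return urllib.request.Request(
        url=api_endpoint.rstrip("/") + "/v1/directives",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_directive_request(
    "https://api.amazonalexa.com",
    "AxThk...",
    "amzn1.echo-api.request.xxxxxxx",
    "<speak>This text is spoken while your skill processes the full response.</speak>",
)
```

Passing `req` to `urllib.request.urlopen` would send the request. With that function, error status codes are raised as `urllib.error.HTTPError`.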

Request headers

Header | Value | Type | Required
--- | --- | --- | ---
Authorization | An API access token in the format Bearer apiAccessToken. Get the apiAccessToken from the context.System.apiAccessToken property in the request sent to your skill. | string | Yes
Content-Type | application/json | string | Yes

Request body

Provide the request body in a JSON object in the following format.

{ 
  "header":{ 
    "requestId":"amzn1.echo-api.request.xxxxxxx"
  },
  "directive":{ 
    "type":"VoicePlayer.Speak",
    "speech":"<speak>This text is spoken while your skill processes the full response.</speak>"
  }
}

Parameter | Value | Type | Required
--- | --- | --- | ---
header.requestId | The requestId for the user's request. Get the requestId from the request.requestId property included in the request sent to your skill. The requestId must exactly match the identifier for the LaunchRequest or IntentRequest sent to your skill. | string | Yes
directive.type | The directive type. VoicePlayer.Speak is the only directive supported. | string | Yes
directive.speech | The text that Alexa should speak, wrapped in SSML <speak> tags. The text must be no longer than 600 characters. This value is smaller than the outputSpeech in the response object. The text must be wrapped in <speak> tags, even if you don't use any other SSML tags within the string. If you use the SSML <audio> tag, the audio can't be any longer than 30 seconds. This value is shorter than the normal audio allowed in this tag. For additional requirements for the MP3 files when using this tag, see <audio>. | string | Yes
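A skill could check the speech string against these constraints before calling the API. The following sketch makes one assumption that the text above doesn't confirm: it applies the 600-character limit to the whole string, including the <speak> tags, which is the stricter reading. `validate_speech` is a hypothetical helper, and it doesn't check audio length.

```python
from typing import List

MAX_SPEECH_CHARS = 600  # limit stated for directive.speech


def validate_speech(speech: str) -> List[str]:
    """Return a list of problems found; an empty list means no problems."""
    problems = []
    stripped = speech.strip()
    if not (stripped.startswith("<speak>") and stripped.endswith("</speak>")):
        problems.append("speech must be wrapped in <speak> tags")
    # Assumption: the limit is applied to the full string, tags included.
    if len(speech) > MAX_SPEECH_CHARS:
        problems.append(f"speech is longer than {MAX_SPEECH_CHARS} characters")
    return problems
```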

Directive response

If Alexa is able to process and speak the text provided in your directive request, the service returns an HTTP 204 code. The response doesn't include any additional data. Any other status code indicates an error that prevented Alexa from speaking the text.

If a properly formed directive request fails, you can safely resend it to retry. Alexa doesn't play the same content multiple times.

Possible responses

The directive request can return the following codes.

Response | Description
--- | ---
204 No Content | Alexa successfully processed the directive.
400 Bad Request | The requestId is either missing from the request or is structurally incorrect.
401 Unauthorized | The authentication token is invalid or doesn't have access to the resource. This code is also returned if the requestId represents a request that's no longer applicable, such as when the progressive response is received after the skill's full response. Errors that previously resulted in a 403 error now result in a 401 error.
405 Method Not Allowed | The method isn't supported.
429 Too Many Requests | The skill was throttled due to an excessive number of requests.
500 Internal Error | An unexpected error occurred.
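One way to act on these codes is sketched below. Which codes are worth retrying is an assumption: the sketch resends only after 429 and 500, because the other error codes describe a problem with the request itself. `send_with_retry` and its `send` callback are hypothetical names. The callback is expected to make one API call and return the HTTP status code.

```python
from typing import Callable

RETRYABLE_STATUSES = {429, 500}  # assumption: possibly transient failures


def send_with_retry(send: Callable[[], int], max_attempts: int = 3) -> bool:
    """Call send() until it returns 204, a non-retryable code is seen,
    or max_attempts is reached. Return True only on a 204."""
    for _ in range(max_attempts):
        status = send()
        if status == 204:
            return True
        if status not in RETRYABLE_STATUSES:
            return False  # 400, 401, 405: resending the same request won't help
    return False
```

The loop resends immediately instead of backing off, because a progressive response is only useful if it arrives before the full response.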


Last updated: Aug 08, 2023