Connect to AVS with HTTP/2

Important: Alexa Voice Service (AVS) developer tools are no longer generally available for Alexa Built-in. Please visit the Works with Alexa program if you are interested in building devices that connect to Alexa.

The Alexa Voice Service (AVS) exposes an HTTP/2 endpoint and supports AVS-initiated directives, which enable your device to access available Alexa features. The following sections show you how to create and maintain an HTTP/2 connection with AVS.

Prerequisites

Before you create an HTTP/2 connection, you must obtain a Login with Amazon (LWA) access token and choose an HTTP/2 client library.

Obtain a Login with Amazon (LWA) access token

To access AVS, your device must obtain an LWA access token, which grants your device access to the AVS APIs on behalf of the user. To learn about the available authorization options and the steps to obtain an LWA access token, see Authorize an AVS Device.

You must send the LWA access token to AVS in the header of each event. If authentication fails for any reason, the connection with AVS closes. For more details about structuring this header, see the HTTP/2 Message Syntax Reference.

The following example shows a sample header. In addition to your access token, include a boundary term in the header of each event sent to AVS. The boundary term separates different parts of a multipart message, such as JSON and binary audio. For examples showing how to use a boundary term, see the HTTP/2 Message Syntax Reference.

:method = POST  
:scheme = https  
:path = /{{API version}}/events
authorization = Bearer {{YOUR_ACCESS_TOKEN}}
content-type = multipart/form-data;  boundary={{BOUNDARY_TERM_HERE}}
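
As a minimal sketch, the header fields above could be assembled as follows. The function name `build_event_headers` and the argument values in the usage line are hypothetical, and the API version string is a placeholder rather than a real AVS version. Most HTTP/2 client libraries set the pseudo-headers (`:method`, `:scheme`, `:path`) themselves, so treat the dictionary as an illustration of the values, not as something to pass to a library verbatim.

```python
def build_event_headers(access_token: str, boundary: str, api_version: str) -> dict:
    """Return the header fields for a POST to the events path (illustrative only)."""
    if not access_token:
        # Without a token, authentication fails and AVS closes the connection.
        raise ValueError("an LWA access token is required")
    return {
        ":method": "POST",
        ":scheme": "https",
        ":path": f"/{api_version}/events",
        "authorization": f"Bearer {access_token}",
        # The boundary term separates the JSON part from any binary audio part.
        "content-type": f"multipart/form-data; boundary={boundary}",
    }


# Hypothetical values, for illustration only.
headers = build_event_headers("example-token", "example-boundary", "vPLACEHOLDER")
```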

Note: Each device instance must have a unique deviceSerialNumber, which is passed in scope data during authorization.

Choose an HTTP/2 client library

The following table shows the HTTP/2 client libraries that you can use with AVS.

Language	Library
C / C++	`nghttp2`
C / C++	`curl` and `libcurl`
Java	`OkHttp`. For troubleshooting tips, see Troubleshooting the OkHttp client library.
Java	`Netty`
Java	`Jetty`

For a complete list of implementations, see GitHub.

Warning: If you use libcurl, your device must make a GET request to /ping every five minutes to maintain the connection. For details, see Ping and Timeout later in this topic.

Open and close event and downchannel streams

This section describes the different expected lifecycles for the event stream and downchannel stream.

Open and close an event stream

The device sends each new event on its own stream. Typically, these streams close after AVS returns the appropriate directives and corresponding audio attachments to your device.

Handle requests sequentially: send a new request only after AVS begins responding to the previous one. AVS has begun responding once the previous request returns response headers.

The following steps show how an event stream opens and closes:

Your device opens a stream and sends a multipart message consisting of one JSON-formatted event and up to one binary audio attachment. For more details, see Structuring an HTTP/2 Request.
AVS returns multipart messages consisting of one or more JSON-formatted directives and corresponding audio attachments on the same stream, potentially before streaming is complete. The URL attribute that follows cid: in a Play or Speak directive also appears in the header of the associated audio attachment.
After receiving a response from AVS, the device closes the event stream.

Note: Your device could receive multiple JSON directives before receiving the corresponding audio attachments. As such, your device should have the logic necessary to pair a directive with its corresponding audio attachment.
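
The pairing logic described in this note could look roughly like the sketch below. It assumes that your multipart parser has already extracted the `cid:` URL from each directive and the content ID from each attachment header. The class name `AttachmentPairer`, its method names, and the angle-bracket stripping are assumptions made for illustration and are not part of AVS.

```python
class AttachmentPairer:
    """Pairs directives with audio attachments that share a content ID."""

    def __init__(self):
        self._waiting_directives = {}  # content ID -> directive awaiting audio
        self._waiting_audio = {}       # content ID -> audio awaiting a directive

    @staticmethod
    def _normalize(content_id: str) -> str:
        # Drop the "cid:" URL prefix and any angle brackets around a header value.
        cid = content_id.strip()
        if cid.startswith("cid:"):
            cid = cid[len("cid:"):]
        return cid.strip("<>")

    def add_directive(self, directive: dict, url: str):
        """Return (directive, audio) if the audio already arrived, else None."""
        cid = self._normalize(url)
        if cid in self._waiting_audio:
            return directive, self._waiting_audio.pop(cid)
        self._waiting_directives[cid] = directive
        return None

    def add_attachment(self, content_id: str, audio: bytes):
        """Return (directive, audio) if the directive already arrived, else None."""
        cid = self._normalize(content_id)
        if cid in self._waiting_directives:
            return self._waiting_directives.pop(cid), audio
        self._waiting_audio[cid] = audio
        return None
```

Buffering in both directions lets the device handle several directives that arrive before any of their attachments, which is the case the note describes.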

Open and close a downchannel stream

In parallel, AVS might send directives to your device on the downchannel. The downchannel is used primarily for AVS-initiated directives.

The following steps show how a downchannel stream opens and closes:

The device makes a GET request to the directives path within 10 seconds of creating a connection with AVS.
This stream sends your device AVS-initiated directives and audio attachments, such as timers, alarms, and instructions originating from the Amazon Alexa app. Unlike an event stream, the downchannel doesn't close immediately. It should remain open for prolonged periods of time, half-closed from the device and open from AVS.
When the downchannel stream closes, your device must immediately establish a new downchannel to make sure that your device can receive AVS-initiated directives.
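
The re-establishment step can be sketched as a simple loop. Here `open_downchannel` and `should_run` are hypothetical callables supplied by your own code: the first is assumed to make the GET request to the directives path and block until the stream closes, and the second is assumed to report whether the device still wants a connection.

```python
from typing import Callable


def keep_downchannel_open(
    open_downchannel: Callable[[], None],
    should_run: Callable[[], bool],
) -> int:
    """Reopen the downchannel each time it closes; return how many times it was opened."""
    opened = 0
    while should_run():
        # Assumed to block while the stream is open and return when it closes.
        open_downchannel()
        opened += 1
        # No delay here: a new downchannel is requested immediately.
    return opened
```

If opening the stream fails because the whole connection is gone, a real implementation would reconnect first instead of looping without pause.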

Create an HTTP/2 connection

When a user turns on your device, create a single HTTP/2 connection with AVS. This connection handles all directives and events, including anything that AVS sends to your device on the downchannel stream. For more details about connection management, see Server-initiated disconnects.

For your device to maintain a connection with AVS, you must meet two requirements:

Create the downchannel stream.
Synchronize the device component states with the appropriate AVS interfaces, such as SpeechRecognizer, AudioPlayer, Alerts, Speaker, and SpeechSynthesizer.

Note: RecognizerState is only required if your device uses Cloud-Based Wake Word Verification.

To create an HTTP/2 connection

Create a downchannel stream by having your device make a GET request to the appropriate /{{API version}}/directives path within 10 seconds of opening the connection with AVS. The request should look like the following example.
```
:method = GET  
:scheme = https  
:path = /{{API version}}/directives
authorization = Bearer {{YOUR_ACCESS_TOKEN}}   
```
Following a successful request, the downchannel stream remains open in a half-closed state from the device and open from AVS for the duration of the connection. Long pauses could occur between AVS-initiated directives.

To synchronize the states of the device's components with AVS, make a POST request to the appropriate /{{API version}}/events path. Send the request on a new event stream on the existing connection; don't open a new connection. Close the event stream when your device receives a directive response.

The following example shows a SynchronizeState event.

:method = POST  
:scheme = https  
:path = /{{API version}}/events
authorization = Bearer {{YOUR_ACCESS_TOKEN}}
content-type = multipart/form-data; boundary={{BOUNDARY_TERM_HERE}}  

--{{BOUNDARY_TERM_HERE}}
Content-Disposition: form-data; name="metadata"  
Content-Type: application/json; charset=UTF-8  

{  
    "context": [   
       // This is an array of context objects that are used to communicate the
       // state of all device components to Alexa. See Context for details.
    ],  
    "event": {  
        "header": {  
            "namespace": "System",  
            "name": "SynchronizeState",  
            "messageId": "{{STRING}}"  
        },  
        "payload": {  
        }  
    }  
}  

--{{BOUNDARY_TERM_HERE}}--
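
A minimal sketch of how a device might assemble this multipart body is shown below. The function name `build_synchronize_state_body` is hypothetical, the use of a UUID for `messageId` is an assumption (the example above only requires a string), and the `context` list must be filled with the real component states described in Context.

```python
import json
import uuid


def build_synchronize_state_body(boundary: str, context: list) -> bytes:
    """Build a multipart body carrying one System.SynchronizeState event."""
    event = {
        "context": context,
        "event": {
            "header": {
                "namespace": "System",
                "name": "SynchronizeState",
                # Assumption: a random UUID serves as the message ID string.
                "messageId": str(uuid.uuid4()),
            },
            "payload": {},
        },
    }
    lines = [
        f"--{boundary}",
        'Content-Disposition: form-data; name="metadata"',
        "Content-Type: application/json; charset=UTF-8",
        "",  # blank line separates the part headers from the JSON
        json.dumps(event),
        f"--{boundary}--",  # closing boundary ends the multipart message
        "",
    ]
    # Multipart messages use CRLF line endings.
    return "\r\n".join(lines).encode("utf-8")
```

The boundary passed here must match the boundary declared in the request's content-type header.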

After synchronizing state, your device can use this connection to do the following:

Send events to AVS and receive directives in response.

Note: Each event and its associated response are sent on a single event stream. Close the stream when the response is received.
Receive directives on the downchannel stream.

Maintain the HTTP/2 connection

Use pings and timeouts to keep the HTTP/2 connection between your device and AVS open. If the server disconnects you, reconnect to AVS by following the instructions in Server-initiated disconnects.

Ping and timeout

Your device must perform one of the following actions to prevent the connection from closing:

Send a PING frame to AVS every 5 minutes when the connection is idle.
Make a GET request to /ping every 5 minutes when the connection is idle.

Sample request

:method = GET  
:scheme = https  
:path = /ping  
authorization = Bearer {{YOUR_ACCESS_TOKEN}}

On a failed /ping, close the connection, and create a new connection.

Warning: If you use libcurl, your device must make a GET request to /ping every five minutes to maintain the connection.
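
One way to decide when a ping is due is to track the time of the last activity on the connection, as in the sketch below. The names `IdleTracker` and `build_ping_headers` are hypothetical, and treating "idle" as "no request or response since the last recorded activity" is an assumption. Timestamps are passed in explicitly so the logic stays independent of any particular clock; in practice they might come from `time.monotonic()`.

```python
PING_INTERVAL_SECONDS = 5 * 60  # ping every 5 minutes while the connection is idle


class IdleTracker:
    """Tracks connection activity and reports when a ping is due."""

    def __init__(self, now: float):
        self._last_activity = now

    def record_activity(self, now: float) -> None:
        # Call this whenever a request, response, or ping uses the connection.
        self._last_activity = now

    def ping_due(self, now: float) -> bool:
        return now - self._last_activity >= PING_INTERVAL_SECONDS


def build_ping_headers(access_token: str) -> dict:
    """Return the header fields for the GET /ping request shown above."""
    return {
        ":method": "GET",
        ":scheme": "https",
        ":path": "/ping",
        "authorization": f"Bearer {access_token}",
    }
```

A device loop could call `ping_due` periodically, send the ping when it returns true, and then call `record_activity`. If the ping fails, the device closes the connection and creates a new one.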

Server-initiated disconnects

When the server initiates a disconnect, your device should take the following steps:

Open a new connection and route any new requests through it.
Close the old connection after processing all open requests and closing their corresponding streams.
Maintain a connection to any stream URL established before the disconnect occurred, such as Amazon Music or Audible. A stream playing before a server-initiated disconnect occurs should continue to play as long as bytes are available.

If the device's attempt to create a new connection fails, the device should try again with an exponential back-off.
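
The retry behavior could be sketched as follows. The function name, the attempt limit, and the delay values are illustrative assumptions, since this topic doesn't specify them. `connect` is a hypothetical callable that returns `True` when a new connection is established, and `sleep` is injected so the schedule can be exercised without waiting.

```python
from typing import Callable


def reconnect_with_backoff(
    connect: Callable[[], bool],
    sleep: Callable[[float], None],
    max_attempts: int = 6,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
) -> bool:
    """Try to connect, doubling the wait after each failure up to max_delay."""
    delay = base_delay
    for attempt in range(max_attempts):
        if connect():
            return True
        if attempt < max_attempts - 1:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return False
```

In production code you would typically pass `time.sleep` as `sleep`. Many implementations also add random jitter to each delay so that a fleet of devices doesn't retry in lockstep.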

Troubleshooting the OkHttp client library

When you use the OkHttp client library for HTTP/2, review the following tips:

The optimal chunk size for streaming voice to the server is 320 bytes.
If your HTTP/2 client uses a response timeout or read timeout, set the timeout to 60 minutes or more for a downchannel.
If your HTTP/2 client uses pooling or marks connections as idle, don't mark a connection as idle for at least 60 minutes.
The maximum number of concurrent streams allowed by the Amazon server is 10. If you have issues with timeouts, check your configuration. Make sure to close the HTTP/2 stream after you get a response. If you don't close HTTP/2 streams, after 10 requests, including events, direct connections, and pings, your service receives an error.
The Amazon server forces a disconnect every 60 minutes. You can have different timeout values, but your implementation must handle the requests and connections ending before the timeout.
Most libraries apply a read timeout, which limits how long a read can wait without receiving any data. Long periods can pass with no data arriving, in which case the timeout expires and the request ends. Amazon recommends that you don't set a read timeout in this scenario. In addition, Amazon recommends that you set the pooling timeout to 60 minutes. You can't turn off the pooling timeout and, by default, OkHttp closes an idle connection after five minutes.
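
To illustrate the 320-byte chunk size from the first tip, the sketch below splits captured audio into chunks before they are written to the event stream. It is written in Python for brevity rather than with OkHttp, and the function name `iter_audio_chunks` is hypothetical.

```python
from typing import Iterator

CHUNK_SIZE_BYTES = 320  # chunk size recommended in the tips above


def iter_audio_chunks(audio: bytes, chunk_size: int = CHUNK_SIZE_BYTES) -> Iterator[bytes]:
    """Yield consecutive chunks of the audio; the last chunk may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]
```

Each yielded chunk would then be written to the request body of the open event stream by whichever HTTP/2 client you use.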

Last updated: Nov 27, 2023