API Mechanics
- Topics
- Capabilities
- Capability and Envelope Versions
- MQTT Publishing Rate and Retry Strategy
- Encryption
- Common Header
- Data Types
- Getting Started: First Boot
Topics
Terminology
Where the device receives messages from AIA, such as on the directive and speaker topics, the device must subscribe to the relevant topics.
Where the device sends messages to AIA, such as on the event and microphone topics, the device publishes on the relevant topics.
Throughout this documentation, general use of a topic for either receiving or sending messages is described as participating in that topic.
Message Planes
The MQTT topic hierarchy below can be thought of as supporting two separate message planes:
- a data plane, which uses binary messages
- a control plane, which uses JSON messages
Control messages often need to communicate that an action must be taken at a certain location in the corresponding data stream. This communication is done through the use of a binary offset. Audio Data messages (as in the Speaker and Microphone capabilities) define a binary offset field in their Audio Stream Header that indicates the position of the message in the binary stream. Control messages that need to take effect at specific locations in a data stream will include the binary offset for the relevant stream.
There are two types of control messages:
- directives, which allow AIA to control the device and data streaming to it
- events, which allow the device to inform AIA of local activities, states, and changes
See the Event-Directive Control Plane documentation to understand the structure of event and directive messages.
Hierarchy

MQTT topics through AWS IoT have a directory-like nested structure. All messages, binary and JSON, are exchanged between device and AIA in leaf topics.
For AIA generally, the relative MQTT root is $aws/alexa/ais. The current AIA envelope version, represented as <envelopeVersion>, is v1. This defines the remaining MQTT topic hierarchy. You do not have to hard-code this value, as AIA will return this value in the iot.topicRoot field in the registration flow.
In the v1 envelope, all MQTT topics used with a specific device are under <clientId>, which is the client ID of the device, as registered with AWS IoT and as communicated to AIA through the registration flow in the iot.clientId field.
Therefore, for a given device communicating with AIA, all MQTT topics are relative to $aws/alexa/ais/v1/<clientId>:
Topic | Data and Encryption Type | Description |
---|---|---|
connection | | Connection management between devices and AIA uses the parent connection topic. |
connection/fromclient | Unencrypted¹ JSON using IAM permissions | The device sends connection management messages, such as Connect, on the fromclient topic. |
connection/fromservice | Unencrypted¹ JSON using IAM permissions | AIA sends connection management messages, such as Acknowledge, on the fromservice topic. |
capabilities | | The device's assertion of capabilities uses the parent capabilities topic. |
capabilities/publish | AIA-encrypted JSON | The device sends the Publish capability assertion message on the publish topic. |
capabilities/acknowledge | AIA-encrypted JSON | AIA acknowledges the device's assertion of capabilities through the Acknowledge message on the acknowledge topic. |
directive | AIA-encrypted JSON | When a capability interface defines a directive and the device asserts support for that capability, AIA will send directive messages on the directive topic. |
event | AIA-encrypted JSON | When a capability interface defines an event and the device asserts support for that capability, the device will send event messages on the event topic. |
microphone | AIA-encrypted binary | The device publishes binary audio data messages containing user speech on the microphone topic. Use of the microphone topic depends on the Microphone capability interface, where the details of its binary messages are documented. |
speaker | AIA-encrypted binary | The device subscribes to the speaker topic to receive messages containing binary audio data for output to end users on its speakers. Use of the speaker topic depends on the Speaker capability interface, where the details of its binary messages are documented. |
¹ Connection management messages do not use AIA encryption, but are nevertheless secured via standard TLS for MQTT.
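As a minimal sketch, a device could assemble its full topic names from the two registration fields mentioned above and the leaf topics listed in the table. The function name and the returned dictionary layout are hypothetical. The sketch also assumes that the iot.topicRoot value already contains the envelope version (for example $aws/alexa/ais/v1), which this page does not state explicitly.

```python
# Leaf topics relative to <topicRoot>/<clientId>, taken from the table above.
DEVICE_PUBLISHES = (
    "connection/fromclient",
    "capabilities/publish",
    "event",
    "microphone",
)
DEVICE_SUBSCRIBES = (
    "connection/fromservice",
    "capabilities/acknowledge",
    "directive",
    "speaker",
)


def build_topics(topic_root: str, client_id: str) -> dict:
    """Return full MQTT topic names, keyed by leaf topic.

    Assumption: topic_root is the iot.topicRoot value from registration and
    already includes the envelope version, e.g. "$aws/alexa/ais/v1".
    """
    base = f"{topic_root.rstrip('/')}/{client_id}"
    return {leaf: f"{base}/{leaf}" for leaf in DEVICE_PUBLISHES + DEVICE_SUBSCRIBES}
```

Keeping the publish and subscribe lists separate makes it straightforward to subscribe only to the topics on which the device receives messages.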
Capabilities
An AIA capability is a set of messages and mechanics to implement some device functionality. It defines which MQTT topics are used for exchange of those messages, as well as their specification.
The capabilities available at launch include
- System 1.0 (which is required for all devices)
- Clock 1.0, which can be used to synchronize a device's local clock with AIA
- Speaker 1.0, for devices that output Alexa experiences on a speaker
- Microphone 1.0, for devices that take user speech input for Alexa interactions
- Alerts 1.0, for devices that can manifest Alexa timers, alarms, and reminders
When a device implements a particular capability, it must assert support for it through the Publish message.
Capability and Envelope Versions
Each capability interface has its own major.minor version to accommodate granular updates to functionality that enable new end user experiences and operational enhancements. The capability version controls the following aspects:
- the JSON message names that are exchanged on the event and directive MQTT topics
- the structure and contents of the payload object in JSON messages for any capability-defined message name in the header
- any additional MQTT topics used, such as the speaker topic in the Speaker capability
- the structure of binary messages defined by the capability
For the overall mechanics and conventions used by AIA, there is an "envelope version", starting with v1. This is reflected both in the topic hierarchy's <envelopeVersion> and in the URI for registration. The envelope version controls the following aspects of a device's interactions with AIA:
- outside the contents of individual payload objects, the JSON format of messages on various MQTT topics, like event and directive
- the structure of the common header
- the registration mechanics
- the connection mechanics
- how devices assert support for individual capabilities
- required capabilities, such as System
- state reporting mechanics
MQTT Publishing Rate and Retry Strategy
Messages must not be published on any topic faster than one message per 50 milliseconds.
Messages may be published more slowly, but only if doing so does not negatively impact the user experience.
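One way to respect the 50-millisecond floor is to track, per topic, the earliest time the next message may go out. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes the limit applies to each topic independently, which is one reading of the rule above.

```python
import time


class PublishPacer:
    """Tracks the earliest time each topic may publish again."""

    def __init__(self, min_interval_s: float = 0.050, clock=time.monotonic):
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._next_allowed = {}  # topic -> earliest allowed publish time

    def delay_before_publish(self, topic: str) -> float:
        """Return the seconds to wait before publishing, and reserve that slot."""
        now = self._clock()
        ready_at = max(now, self._next_allowed.get(topic, now))
        self._next_allowed[topic] = ready_at + self._min_interval_s
        return ready_at - now
```

A caller would sleep for the returned delay and then publish. The clock is injectable so that the pacing logic can be exercised without real waiting.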
AIA also defines an exponential backoff retry strategy for a device resending messages that failed to be delivered or successfully processed:
- The device should wait for exponentially increasing amounts of time between attempts, up to a maximum delay of 1 hour (retrying every hour thereafter).
- Each delay should have some random amount of time (jitter) added or subtracted. The randomization should be unique to the individual device, not based on seeds such as a timestamp.
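The retry strategy above can be sketched as a generator of delays. Only the 1-hour cap comes from this page. The base delay, growth factor, and jitter fraction are illustrative assumptions, and the function name is hypothetical. Jitter is drawn from random.SystemRandom, which uses the operating system's entropy source rather than a timestamp-derived seed.

```python
import random

MAX_DELAY_S = 3600.0  # 1-hour cap from the retry strategy above


def retry_delays(base_s=1.0, factor=2.0, jitter_fraction=0.25, rng=None):
    """Yield an endless series of jittered, exponentially growing delays.

    base_s, factor, and jitter_fraction are illustrative values only.
    """
    # SystemRandom is not seeded from a timestamp.
    rng = rng or random.SystemRandom()
    delay = base_s
    while True:
        # Add or subtract up to jitter_fraction of the current delay.
        jitter = delay * jitter_fraction * rng.uniform(-1.0, 1.0)
        yield min(MAX_DELAY_S, max(0.0, delay + jitter))
        delay = min(MAX_DELAY_S, delay * factor)
```

Because the result is clamped to the cap, jitter can only shorten the delay once the 1-hour ceiling has been reached.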
Encryption
The device and AIA must use AES-GCM to perform end-to-end encryption of message content published on all AIA-specific MQTT topics except those under connection.
This AIA-specific encryption is layered on top of the TLS-encrypted MQTT connection and protects Alexa-specific customer data while it is in transit through AWS IoT systems.
The common header describes which fields are encrypted.
Common Header
With the exception of messages under the connection topic, all AIA-specific messages start with a common binary header, regardless of data type.
Component | Byte Offset | Size (Bytes) | Name | Description |
---|---|---|---|---|
Common Header | 0 | 4 | sequence | The sequence number of the message on a particular topic. This field is an unsigned 32-bit integer stored in little-endian byte order. |
Common Header | 4 | 12 | IV | The encryption initialization vector. This value is used with the shared secret to decrypt the message. |
Common Header | 16 | 16 | MAC | The encryption message authentication code. This value verifies the integrity of the message in transit. |
Common Header | 32 | 4 | encrypted sequence | An encrypted copy of the sequence number. When decrypted, this must match the unencrypted sequence number. Note: This and the following encrypted message field are encrypted as a single blob and must be decrypted together. |
Message Data | 36 | | encrypted message | Encrypted JSON or binary message. These data types define their own headers, detailed in their respective documentation. Note: This and the previous encrypted sequence field are encrypted as a single blob and must be decrypted together. |
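To make the byte layout in the table concrete, the following sketch splits a raw MQTT payload into its unencrypted header fields and the encrypted blob, then compares the two sequence numbers once the blob has been decrypted. The function names are hypothetical, the AES-GCM decryption step itself is deliberately left out, and the sketch assumes the decrypted copy of the sequence number uses the same little-endian encoding as the unencrypted one.

```python
import struct

# Little-endian: 4-byte sequence, 12-byte IV, 16-byte MAC (32 bytes total).
COMMON_HEADER = struct.Struct("<I12s16s")
ENCRYPTED_OFFSET = COMMON_HEADER.size  # the encrypted blob starts at byte 32


def split_common_header(message: bytes):
    """Return (sequence, iv, mac, encrypted_blob) from a raw MQTT payload."""
    if len(message) < ENCRYPTED_OFFSET + 4:
        raise ValueError("message too short for the common header")
    sequence, iv, mac = COMMON_HEADER.unpack_from(message, 0)
    return sequence, iv, mac, message[ENCRYPTED_OFFSET:]


def check_sequence(sequence: int, decrypted_blob: bytes) -> bytes:
    """Compare the decrypted sequence copy with the plain one.

    Returns the message body that follows the 4-byte sequence copy.
    Assumption: the decrypted copy is little-endian like the plain field.
    """
    (decrypted_sequence,) = struct.unpack_from("<I", decrypted_blob, 0)
    if decrypted_sequence != sequence:
        # The caller would disconnect with a code value of MESSAGE_TAMPERED.
        raise ValueError("MESSAGE_TAMPERED")
    return decrypted_blob[4:]
```

The encrypted sequence and encrypted message are returned as one slice because they form a single blob that must be decrypted together.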
Sequence Numbers
All AIA-specific messages have a sequence number that's scoped to the topic on which the message is sent and the particular connection.
Purpose
- AIA encryption and security
  - The sequence number appears in the common header unencrypted (starting at byte 0) and encrypted (starting at byte 32).
  - When the device decrypts the message, if the unencrypted and decrypted sequence numbers do not match, the device must immediately disconnect with a code value of MESSAGE_TAMPERED.
- Resequencing out-of-order messages
  - For messages that arrive out of order, the device uses the sequence number to process the messages in the correct order.
  - The on-device resequencing buffer for each topic should have a minimum of four slots.
Assignment
- Sequence numbers are unsigned 32-bit integers represented in little-endian byte order.
- When a new connection is established, the sequence number on each topic is reset to binary 0.
- The sequence number increments by 1 with each message on the topic.
- If more than 2^32 = 4,294,967,296 messages are sent on a topic on the same connection, the sequence number overflows and wraps to 0.
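The assignment rules and the resequencing buffer described above can be sketched together. The class and function names are hypothetical. Raising an error for a message that falls outside the buffer window is an illustrative choice, since this page does not specify that behavior.

```python
SEQUENCE_MODULUS = 2 ** 32


def next_sequence(current: int) -> int:
    """Increment a sequence number, wrapping to 0 after 2**32 - 1."""
    return (current + 1) % SEQUENCE_MODULUS


class Resequencer:
    """Releases messages on one topic in sequence order."""

    def __init__(self, slots: int = 4):
        self._slots = slots
        self._expected = 0   # sequence numbers restart at 0 on a new connection
        self._pending = {}   # sequence -> message, held until its turn

    def push(self, sequence: int, message: bytes) -> list:
        """Accept one message and return the messages now ready, in order."""
        # Modular distance handles wrap-around from 2**32 - 1 to 0.
        ahead = (sequence - self._expected) % SEQUENCE_MODULUS
        if ahead >= self._slots:
            # Too far ahead for the buffer, or already behind the window.
            raise ValueError("sequence outside resequencing window")
        self._pending[sequence] = message
        ready = []
        while self._expected in self._pending:
            ready.append(self._pending.pop(self._expected))
            self._expected = next_sequence(self._expected)
        return ready
```

One Resequencer instance would be kept per subscribed topic and discarded whenever the connection is re-established.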
Data Types
AIA v1 defines two data types: JSON and binary stream. Each AIA-specific MQTT topic uses one or the other of these data types.
JSON
Topics that use the JSON data type publish messages consisting of the common header followed by an AIA-encrypted† JSON string.
A JSON string is an ASCII-encoded string consisting of a single JSON object. Individual capability interfaces and the capability assertion mechanism define the JSON messages.
The size of the JSON messages will vary by topic: Some topics may approach the 128-KB limit for MQTT messages, while others may remain significantly beneath it.
A single JSON object should never be split across multiple messages. In addition, only one JSON object is permitted per message. Multiple objects in a single message or an incomplete JSON object will be treated as errors.
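A receiver could enforce the one-object-per-message rule roughly as follows. This is a sketch with a hypothetical function name. It assumes the input is the message body after decryption and after the sequence number copy has been removed.

```python
import json


def parse_json_message(decrypted: bytes) -> dict:
    """Decode one decrypted JSON message body into a dict.

    Raises ValueError for non-ASCII bytes, truncated JSON, trailing data
    (such as a second object), or a top-level value that is not an object.
    """
    text = decrypted.decode("ascii")  # UnicodeDecodeError is a ValueError
    obj = json.loads(text)            # JSONDecodeError is a ValueError
    if not isinstance(obj, dict):
        raise ValueError("top-level JSON value must be an object")
    return obj
```

Python's json.loads rejects both incomplete documents and extra data after the first value, so all of the error cases named above surface as a ValueError.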
The documentation for JSON messages describes the value types of all fields. These include object, list, boolean, string, and long. Unless otherwise noted, "long" is a 64-bit unsigned integer, represented as a JSON number.
† In AIA v1, there is one exception to messages being AIA-encrypted. Connection management messages, which use the fromclient and fromservice MQTT topics under the connection topic, do not use AIA encryption, but are still secured with TLS. Consequently, connection messages are not preceded by the common header.
Binary Stream
Topics used for binary stream messages publish messages consisting of the common header, followed by a binary stream header, followed by a binary data payload. Some binary stream types may reserve their own "header" bytes for additional metadata about the substantive payload.
Component | Byte Offset | Size (Bytes) | Name | Description |
---|---|---|---|---|
Binary Stream Header | 0 | 4 | length | The length in bytes of the data in this binary stream message, including only the audio stream header and payload. (The common header and binary stream header should not be included.) This field is an unsigned 32-bit integer stored in little-endian byte order. |
Binary Stream Header | 4 | 1 | type | The type of the binary stream message. Possible values are determined by the capability interface that defines the message. For example, on the speaker topic in the Speaker capability, 0 signifies that the message contains audio to be played, while 1 signifies an audio marker. |
Binary Stream Header | 5 | 1 | count | The 0-indexed number of data chunks in this message, as further specified in each capability interface. Possible values: 0-255, signifying the number of chunks in the message (1-256, respectively). |
Binary Stream Header | 6 | 2 | reserved | These bytes are reserved for alignment and backward-compatible future use. They will be set to binary 0s. |
Binary Data Payload | 8 | 0 | data | The binary data defined by the type. This payload may define its own sub-header and sub-payload fields, such as the Audio Data type in the Speaker and Microphone capabilities. |
See a full MQTT payload structure example on the speaker topic.
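The binary stream header in the table above can be unpacked with a fixed 8-byte layout. The sketch below uses a hypothetical function name. It assumes the input begins at byte 0 of the binary stream header, that is, after the common header has been removed and the message decrypted. Interpreting the data bytes is left to the capability that defines the type.

```python
import struct

# Little-endian: 4-byte length, 1-byte type, 1-byte count, 2 reserved bytes.
BINARY_STREAM_HEADER = struct.Struct("<IBB2x")


def parse_binary_stream(body: bytes):
    """Return (type, chunk_count, data) from a decrypted binary stream message."""
    if len(body) < BINARY_STREAM_HEADER.size:
        raise ValueError("message too short for the binary stream header")
    length, msg_type, count = BINARY_STREAM_HEADER.unpack_from(body, 0)
    start = BINARY_STREAM_HEADER.size
    if len(body) < start + length:
        raise ValueError("length field exceeds the data received")
    data = body[start:start + length]
    # The count field is 0-indexed: 0 means one chunk, 255 means 256 chunks.
    return msg_type, count + 1, data
```

Returning the chunk count as a 1-based number keeps the off-by-one conversion in a single place.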
Getting Started: First Boot
- The device needs to set its clock to an accurate time before establishing any TLS connection. (Both registration and AWS IoT require TLS encryption for messages.) This may be done by performing an NTP query.
- The device must be able to perform an OTA update before completing registration successfully. This may be implemented by connecting to AWS IoT initially to check for an OTA update, disconnecting, then proceeding with registration. Alternatively, the device may attempt to register, and if registration fails, connect to AWS IoT to check for an OTA update, then retry registration.
- After registration, the device must connect to AIA.
- The device must assert support for the capabilities it implements.
- The device may then begin exchanging messages with AIA on any topics required by its supported capabilities.
Note: The device is not required to have more than one network connection open at a time. It may connect to the registration endpoint, disconnect, and then connect to AWS IoT.