API Mechanics

Topics

Terminology

Where the device receives messages from AIA, such as on the directive and speaker topics, the device must subscribe to the relevant topics.

Where the device sends messages to AIA, such as on the event and microphone topics, the device publishes on the relevant topics.

Throughout this documentation, general use of a topic for either receiving or sending messages is described as participating in that topic.

Message Planes

The MQTT topic hierarchy below can be thought of as supporting two separate message planes: a control plane for JSON control messages and a data plane for binary data streams.

Control messages often need to communicate that an action must be taken at a certain location in the corresponding data stream. This communication is done through the use of a binary offset. Audio Data messages (as in the Speaker and Microphone capabilities) define a binary offset field in their Audio Stream Header that indicates the position of the message in the binary stream. Control messages that need to take effect at specific locations in a data stream will include the binary offset for the relevant stream.

There are two types of control messages:

  • directives, which allow AIA to control the device and data streaming to it
  • events, which allow the device to inform AIA of local activities, states, and changes

See the Event-Directive Control Plane documentation to understand the structure of event and directive messages.

Hierarchy

[Figure: AIA Topic Hierarchy]

MQTT topics through AWS IoT have a directory-like nested structure. All messages, binary and JSON, are exchanged between device and AIA in leaf topics.

For AIA generally, the relative MQTT root is $aws/alexa/ais. The current AIA envelope version, represented as <envelopeVersion>, is v1. This defines the remaining MQTT topic hierarchy. You do not have to hard-code this value, as AIA will return this value in the iot.topicRoot field in the registration flow.

In the v1 envelope, all MQTT topics used with a specific device are under <clientId>, which is the client ID of the device, as registered with AWS IoT and as communicated to AIA through the registration flow in the iot.clientId field.

Therefore, for a given device communicating with AIA, all MQTT topics are relative to $aws/alexa/ais/v1/<clientId>:

| Topic | Data and Encryption Type | Description |
| --- | --- | --- |
| connection | | Connection management between devices and AIA uses the parent connection topic. |
| connection/fromclient | Unencrypted [1] JSON using IAM permissions | The device sends connection management messages, such as Connect, on the fromclient topic. |
| connection/fromservice | Unencrypted [1] JSON using IAM permissions | AIA sends connection management messages, such as Acknowledge, on the fromservice topic. |
| capabilities | | The device's assertion of capabilities uses the parent capabilities topic. |
| capabilities/publish | AIA-encrypted JSON | The device sends the Publish capability assertion message on the publish topic. |
| capabilities/acknowledge | AIA-encrypted JSON | AIA acknowledges the device's assertion of capabilities through the Acknowledge message on the acknowledge topic. |
| directive | AIA-encrypted JSON | When a capability interface defines a directive and the device asserts support for that capability, AIA will send directive messages on the directive topic. |
| event | AIA-encrypted JSON | When a capability interface defines an event and the device asserts support for that capability, the device will send event messages on the event topic. |
| microphone | AIA-encrypted binary | The device publishes binary audio data messages containing user speech on the microphone topic. Use of the microphone topic depends on the Microphone capability interface, where the details of its binary messages are documented. |
| speaker | AIA-encrypted binary | The device subscribes to the speaker topic to receive messages containing binary audio data for output to end users on its speakers. Use of the speaker topic depends on the Speaker capability interface, where the details of its binary messages are documented. |

[1] Connection management messages do not use AIA encryption, but are nevertheless secured via standard TLS for MQTT.
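
As an illustration of the hierarchy above, the following minimal sketch expands a topic root into the full names of the leaf topics. The helper names (build_topics, default_topic_root) are hypothetical and not part of AIA. The sketch also assumes that the root passed in already ends in the client ID; in practice the value returned in iot.topicRoot during registration should be preferred over a locally constructed one.

```python
# Leaf topics from the v1 hierarchy table, relative to the per-device root.
LEAF_TOPICS = (
    "connection/fromclient",
    "connection/fromservice",
    "capabilities/publish",
    "capabilities/acknowledge",
    "directive",
    "event",
    "microphone",
    "speaker",
)


def default_topic_root(client_id: str, envelope_version: str = "v1") -> str:
    """Fallback root (assumption: used only when iot.topicRoot is unavailable)."""
    return f"$aws/alexa/ais/{envelope_version}/{client_id}"


def build_topics(topic_root: str) -> dict:
    """Map each relative leaf topic to its full MQTT topic name."""
    root = topic_root.rstrip("/")
    return {leaf: f"{root}/{leaf}" for leaf in LEAF_TOPICS}


topics = build_topics(default_topic_root("my-device"))
print(topics["speaker"])
```

The device would subscribe to the fromservice, acknowledge, directive, and speaker entries and publish on the others, following the descriptions in the table.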

Capabilities

An AIA capability is a set of messages and mechanics to implement some device functionality. It defines which MQTT topics are used for exchange of those messages, as well as their specification.

The capabilities available at launch include:

  • System 1.0, which is required for all devices
  • Clock 1.0, which can be used to synchronize a device's local clock with AIA
  • Speaker 1.0, for devices that output Alexa experiences on a speaker
  • Microphone 1.0, for devices that take user speech input for Alexa interactions
  • Alerts 1.0, for devices that can manifest Alexa timers, alarms, and reminders

When a device implements a particular capability, it must assert support for it through the Publish message.

Capability and Envelope Versions

Each capability interface has its own major.minor version to accommodate granular updates to functionality that enable new end user experiences and operational enhancements. The capability version controls the following aspects:

  • the JSON message names that are exchanged on the event and directive MQTT topics
  • the structure and contents of the payload object in JSON messages for any capability-defined message name in the header
  • any additional MQTT topics used, such as the speaker topic in the Speaker capability
  • the structure of binary messages defined by the capability

For the overall mechanics and conventions used by AIA, there is an "envelope version", starting with v1. This is reflected in both the topic hierarchy's <envelopeVersion> and the URI for registration. The envelope version controls the following aspects of a device's interactions with AIA:

MQTT Publishing Rate and Retry Strategy

Messages may not be published on any topic faster than one message per 50 milliseconds.

Messages may be published more slowly, but only if doing so does not negatively impact the user experience.
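
One way to respect the publishing rate is to track the time of the last publish and wait out the remainder of the 50-millisecond interval. The sketch below is only an illustration: the PublishPacer class is hypothetical, and it assumes the limit applies to each topic independently, which the text above does not state explicitly.

```python
import time

MIN_PUBLISH_INTERVAL_S = 0.050  # one message per 50 milliseconds


class PublishPacer:
    """Tracks the last publish time per topic (per-topic scope is an assumption)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last = {}

    def delay_needed(self, topic: str) -> float:
        """Seconds to wait before the next publish on this topic is allowed."""
        last = self._last.get(topic)
        if last is None:
            return 0.0
        elapsed = self._clock() - last
        return max(0.0, MIN_PUBLISH_INTERVAL_S - elapsed)

    def mark_published(self, topic: str) -> None:
        """Record that a message was just published on this topic."""
        self._last[topic] = self._clock()
```

A publisher would call time.sleep(pacer.delay_needed(topic)) before each publish and pacer.mark_published(topic) immediately after it.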

AIA also defines an exponential backoff retry strategy for resending messages that failed to be delivered or processed successfully:

  • The device should wait for an exponentially increasing amount of time between attempts, up to a maximum delay of 1 hour, and then retry every hour thereafter.
  • Each delay should be lengthened or shortened by a random amount (jitter). The randomization should be unique to the individual device and not based on seeds such as a timestamp.
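
The retry strategy above could be sketched as follows. Only the 1-hour cap and the need for device-unique jitter come from this documentation. The 1-second base delay, the doubling factor, and the 25% jitter range are illustrative assumptions, and backoff_delay is a hypothetical helper name.

```python
import random

MAX_DELAY_S = 3600.0  # maximum delay of 1 hour


def backoff_delay(attempt, base_s=1.0, jitter_fraction=0.25, rng=None):
    """Delay in seconds before retry number `attempt` (0-based).

    base_s, the factor of 2, and jitter_fraction are assumptions.
    """
    if rng is None:
        # SystemRandom draws from OS entropy rather than a timestamp seed.
        rng = random.SystemRandom()
    exponent = min(attempt, 32)  # keeps the power small; the cap applies anyway
    nominal = min(MAX_DELAY_S, base_s * (2 ** exponent))
    jitter = nominal * jitter_fraction * rng.uniform(-1.0, 1.0)
    return max(0.0, min(MAX_DELAY_S, nominal + jitter))


for attempt in range(5):
    print(attempt, round(backoff_delay(attempt), 2))
```

Clamping after the jitter is added keeps every delay at or below one hour, so later attempts settle at roughly hourly retries.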

Encryption

The device and AIA must use AES-GCM to perform end-to-end encryption of message content published on all AIA-specific MQTT topics except those under connection.

This AIA-specific encryption is applied on top of the TLS-encrypted MQTT connection and protects Alexa-specific customer data while it is in transit through AWS IoT systems.

The common header describes which fields are encrypted.

Common Header

With the exception of messages under the connection topic, all AIA-specific messages start with a common binary header, regardless of data type.

| Component | Byte Offset | Size (Bytes) | Name | Description |
| --- | --- | --- | --- | --- |
| Common Header | 0 | 4 | sequence | The sequence number of the message on a particular topic. This field is an unsigned 32-bit integer stored in little-endian byte order. |
| | 4 | 12 | IV | The encryption initialization vector. This value is used with the shared secret to decrypt the message. |
| | 16 | 16 | MAC | The encryption message authentication code. This value verifies the integrity of the message in transit. |
| | 32 | 4 | encrypted sequence | An encrypted copy of the sequence number. When decrypted, this must match the unencrypted sequence number. Note: This and the following encrypted message field are encrypted as a single blob and must be decrypted together. |
| Message Data | 36 | | encrypted message | Encrypted JSON or binary message. These data types define their own headers, detailed in their respective documentation. Note: This and the previous encrypted sequence field are encrypted as a single blob and must be decrypted together. |
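
To make the byte layout concrete, the sketch below splits a received message into the common header fields and, once the encrypted blob has been decrypted, checks the decrypted sequence number against the unencrypted one. The AES-GCM decryption step itself is deliberately left out. The names CommonHeader, parse_common_header, and split_plaintext are hypothetical.

```python
import struct
from typing import NamedTuple

UNENCRYPTED_SIZE = 32  # sequence (4) + IV (12) + MAC (16)
MIN_MESSAGE_SIZE = 36  # plus the 4-byte encrypted sequence


class CommonHeader(NamedTuple):
    sequence: int      # unencrypted sequence number (little-endian uint32)
    iv: bytes          # 12-byte initialization vector
    mac: bytes         # 16-byte message authentication code
    ciphertext: bytes  # encrypted sequence + encrypted message, one blob


def parse_common_header(message: bytes) -> CommonHeader:
    """Split a raw MQTT payload according to the common header layout."""
    if len(message) < MIN_MESSAGE_SIZE:
        raise ValueError("message shorter than the common header")
    sequence, iv, mac = struct.unpack_from("<I12s16s", message, 0)
    return CommonHeader(sequence, iv, mac, message[UNENCRYPTED_SIZE:])


def split_plaintext(header: CommonHeader, plaintext: bytes) -> bytes:
    """Check the decrypted sequence number and return the message body.

    `plaintext` is the blob after AES-GCM decryption (not shown here).
    """
    (decrypted_sequence,) = struct.unpack_from("<I", plaintext, 0)
    if decrypted_sequence != header.sequence:
        raise ValueError("sequence mismatch: treat the message as tampered")
    return plaintext[4:]
```

On a mismatch, a real device would disconnect rather than continue processing, as described under Sequence Numbers.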

Sequence Numbers

All AIA-specific messages have a sequence number that's scoped to the topic on which the message is sent and the particular connection.

Purpose

  • AIA encryption and security
    • The sequence number appears in the common header unencrypted (starting at byte 0) and encrypted (starting at byte 32).
    • When the device decrypts the message, if the unencrypted and decrypted sequence numbers do not match, the device must immediately disconnect with a code value of MESSAGE_TAMPERED.
  • Resequencing out-of-order messages
    • For messages that arrive out of order, the device uses the sequence number to process the messages in the correct order.
    • The on-device resequencing buffer for each topic should have a minimum of four slots.
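
A resequencing buffer along these lines might look like the sketch below. The four-slot default comes from the text above. Dropping messages whose sequence number is behind the expected one and raising an error when the buffer is full are assumptions, since this section does not specify either behavior. Resequencer is a hypothetical class name.

```python
MASK32 = 0xFFFFFFFF


class Resequencer:
    """Releases messages on one topic in sequence-number order."""

    def __init__(self, slots=4, first_sequence=0):
        self._slots = slots
        self._next = first_sequence
        self._pending = {}

    def push(self, sequence, payload):
        """Return the payloads that can now be processed, in order."""
        ahead = (sequence - self._next) & MASK32  # distance, wrap-aware
        if ahead >= 0x80000000:
            return []  # behind the expected number: dropped (assumption)
        if ahead > 0:
            is_new = sequence not in self._pending
            if is_new and len(self._pending) >= self._slots:
                raise OverflowError("resequencing buffer full")  # assumption
            self._pending[sequence] = payload
            return []
        ready = [payload]
        self._next = (self._next + 1) & MASK32
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next = (self._next + 1) & MASK32
        return ready


buffer = Resequencer()
print(buffer.push(1, "second"))
print(buffer.push(0, "first"))
```

The first call buffers the early message and returns nothing; the second releases both messages in order.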

Assignment

  • Sequence numbers are unsigned 32-bit integers represented in little-endian byte order.
  • When a new connection is established, the sequence number on each topic is reset to binary 0.
  • The sequence number increments by 1 with each message on the topic.
  • If more than 2^32 = 4,294,967,296 messages are sent on a topic on the same connection, the sequence number overflows and wraps to 0.
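
The assignment rules above can be sketched as a small per-topic counter. SequenceCounter is a hypothetical name, and calling reset() whenever a new connection is established is left to the caller.

```python
import struct


class SequenceCounter:
    """Per-topic outgoing sequence numbers for a single connection."""

    def __init__(self):
        self._next = {}

    def reset(self):
        """Call when a new connection is established: all topics restart at 0."""
        self._next.clear()

    def next_bytes(self, topic: str) -> bytes:
        """Return the next sequence number as 4 little-endian bytes."""
        value = self._next.get(topic, 0)
        self._next[topic] = (value + 1) & 0xFFFFFFFF  # wraps to 0 after 2^32 - 1
        return struct.pack("<I", value)


counter = SequenceCounter()
print(counter.next_bytes("event").hex())
print(counter.next_bytes("event").hex())
```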

Data Types

AIA v1 defines two data types: JSON and binary stream. Each AIA-specific MQTT topic uses one or the other of these data types.

JSON

Topics that use the JSON data type publish messages consisting of the common header followed by an AIA-encrypted JSON string.

A JSON string is an ASCII-encoded string consisting of a single JSON object. Individual capability interfaces and the capability assertion mechanism define the JSON messages.

The size of the JSON messages will vary by topic: Some topics may approach the 128-KB limit for MQTT messages, while others may remain significantly beneath it.

A single JSON object should never be split across multiple messages. In addition, only one JSON object is permitted per message. Multiple objects in a single message or an incomplete JSON object will be treated as errors.
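
A receiver could enforce the one-object-per-message rule with a check like the one below, applied to the decrypted message body. This is a minimal sketch; parse_json_message is a hypothetical helper, and what the device does after rejecting a message is outside its scope.

```python
import json


def parse_json_message(plaintext: bytes) -> dict:
    """Decode a decrypted message body that must hold exactly one JSON object."""
    text = plaintext.decode("ascii")  # non-ASCII bytes raise UnicodeDecodeError
    value = json.loads(text)          # incomplete or trailing data raises JSONDecodeError
    if not isinstance(value, dict):
        raise ValueError("message must be a single JSON object")
    return value
```

Both UnicodeDecodeError and json.JSONDecodeError are subclasses of ValueError, so a caller can catch ValueError to handle every rejection case in one place.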

The documentation for JSON messages describes the value types of all fields. These include object, list, boolean, string, and long. Unless otherwise noted, "long" is a 64-bit unsigned integer, represented as a JSON number.

In AIA v1, there is one exception to messages being AIA-encrypted. Connection management messages, which use the fromclient and fromservice MQTT topics under the connection topic, do not use AIA encryption, but are still secured with TLS. Consequently, connection messages are not preceded by the common header.

Binary Stream

Topics used for binary stream messages publish messages consisting of the common header, followed by a binary stream header, followed by a binary data payload. Some binary stream types may reserve their own "header" bytes for additional metadata about the substantive payload.

| Component | Byte Offset | Size (Bytes) | Name | Description |
| --- | --- | --- | --- | --- |
| Binary Stream Header | 0 | 4 | length | The length in bytes of the data in this binary stream message, including only the audio stream header and payload. (The common header and binary stream header should not be included.) This field is an unsigned 32-bit integer stored in little-endian byte order. |
| | 4 | 1 | type | The type of the binary stream message. Possible values are determined by the capability interface that defines the message. For example, on the speaker topic in the Speaker capability, 0 signifies that the message contains audio to be played, while 1 signifies an audio marker. |
| | 5 | 1 | count | The 0-indexed number of data chunks in this message, as further specified in each capability interface. Possible Values: 0-255, signifying the number of chunks in the message (1-256, respectively). |
| | 6 | 2 | reserved | These bytes are reserved for alignment and backward-compatible future use. They will be set to binary 0s. |
| Binary Data Payload | 8 | 0 | data | The binary data defined by the type. This payload may define its own sub-header and sub-payload fields, such as the Audio Data type in the Speaker and Microphone capabilities. |

See a full MQTT payload structure example on the speaker topic.
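
The binary stream header layout can be illustrated with a small pack and parse pair. The sketch operates on the decrypted message body, that is, the bytes that follow the common header and encrypted sequence number, so its offsets start at the binary stream header. The function and class names are hypothetical.

```python
import struct
from typing import NamedTuple

# length (uint32), type (uint8), count (uint8), reserved (2 bytes), little-endian
BINARY_STREAM_HEADER = struct.Struct("<IBBH")


class BinaryStreamMessage(NamedTuple):
    msg_type: int     # meaning defined by the capability interface
    chunk_count: int  # 1-256, decoded from the 0-indexed count field
    data: bytes       # type-specific sub-header and payload


def pack_binary_stream(msg_type: int, chunk_count: int, data: bytes) -> bytes:
    """Build the binary stream header followed by the data."""
    if not 1 <= chunk_count <= 256:
        raise ValueError("chunk_count must be between 1 and 256")
    header = BINARY_STREAM_HEADER.pack(len(data), msg_type, chunk_count - 1, 0)
    return header + data


def parse_binary_stream(body: bytes) -> BinaryStreamMessage:
    """Parse a decrypted binary stream message body."""
    length, msg_type, count, _reserved = BINARY_STREAM_HEADER.unpack_from(body, 0)
    start = BINARY_STREAM_HEADER.size
    data = body[start:start + length]
    if len(data) != length:
        raise ValueError("message shorter than its length field")
    return BinaryStreamMessage(msg_type, count + 1, data)


raw = pack_binary_stream(0, 1, b"\x01\x02\x03")
print(raw.hex())
```

Note that the length field counts only the data after the 8-byte binary stream header, matching the description of the length field above.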

Getting Started: First Boot

  1. The device needs to set its clock to an accurate time before establishing any TLS connection. (Both registration and AWS IoT require TLS encryption for messages.) This may be done by performing an NTP query.
  2. The device must be able to perform an OTA update before completing registration successfully. This may be implemented by connecting to AWS IoT initially to check for an OTA update, disconnecting, then proceeding with registration. Alternatively, the device may attempt to register, and if registration fails, connect to AWS IoT to check for an OTA update, then retry registration.
  3. After registration, the device must connect to AIA.
  4. The device must assert support for the capabilities it implements.
  5. The device may then begin exchanging messages with AIA on any topics required by its supported capabilities.

Note: The device is not required to have more than one network connection open at a time. It may connect to the registration endpoint, disconnect, and then connect to AWS IoT.