Alexa Voice Service v20160207


The Alexa Voice Service (AVS) allows developers to enable voice interactions on connected products. After integrating AVS, your product has access to the built-in functionalities of Alexa, such as music playback, timers and alarms, package tracking, movie listings, and calendar management. You can extend the voice features of your product either by developing your own Alexa skills or by using Alexa skills built by other developers with the Alexa Skills Kit.

AVS comprises capability interfaces that correspond to client functionality, like speech recognition, audio playback, and volume control.

AVS uses Login with Amazon (LWA) for product authorization and exposes HTTP/2 endpoints.

The current version of the Alexa Voice Service is v20160207. Read more about AVS Envelope and Capability Versioning.

What's new?

For new and updated AVS features, see blogs and documentation in the documentation changelog.

Authorization

To access the AVS API, your product must obtain a Login with Amazon (LWA) access token, which grants a product access to call the API for a user. There are multiple ways to authorize a product:

Remote Authorization is used to authorize devices with a mobile app. Typically, remote authorization is used with headless devices, like a smart speaker.

Local Authorization is used to authorize Alexa from the AVS-enabled product. Typically, local authorization is used with Android and iOS apps.

Code Based Linking is an authorization method optimal for products with limited or no access to character input, such as a television or smart watch.
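Whichever method produces it, the access token has to accompany each call to the API. The following is a minimal sketch of attaching a token to a request, assuming it is sent as a bearer token in an `authorization` header. The header format, the helper name `build_auth_headers`, and the sample token are assumptions for illustration and are not taken from this page.

```python
def build_auth_headers(access_token):
    """Return request headers carrying an LWA access token.

    Assumption: the token travels as a bearer token in the
    `authorization` header. Check the HTTP/2 request documentation
    for the authoritative header list.
    """
    token = access_token.strip()
    if not token:
        raise ValueError("access token must not be empty")
    return {"authorization": f"Bearer {token}"}


# Hypothetical token value, for illustration only.
print(build_auth_headers("example-access-token"))
```

Rejecting an empty token early keeps a failed authorization flow from surfacing later as an opaque request error.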

Connected endpoints

Throughout the AVS documentation, the entity that maintains the HTTP/2 connection to Alexa is referred to as the "device," "client," or "product." In basic integrations, the device is the only entity that is the subject of user interactions and exchanges of messages with Alexa.

AVS also enables connecting additional "endpoints" to Alexa. (Note that this isn't the same use of the term as "URI endpoint" for API connections.) These connected endpoints don't maintain their own direct connections to Alexa, relying instead on the HTTP/2 device to proxy events and directives on their behalf.

From Alexa's perspective, the device and its connected endpoints are all "endpoints". This is reflected in how a device reports both itself and its connected endpoints in the same endpoints list in Alexa.Discovery. Where certain capability interfaces, mechanics, or fields apply only to the device maintaining the HTTP/2 connection, the documentation makes that clear. Otherwise, "endpoint" can refer to either the device or any connected endpoint.

Topology

Connected endpoints are analogous to those described by the Smart Home Skills Kit. When an endpoint connects through a skill, it requires that the developer maintains a cloud service that is called by an AWS Lambda function. That cloud service exchanges events and directives with Alexa on behalf of any endpoints. Therefore, the developer is responsible for the connection between the cloud service and the endpoint.

When the endpoint connects to an AVS device, the device takes the place of the cloud service, informing Alexa about the endpoint through Alexa.Discovery. When a capability interface is suitable for a connected endpoint, it will include an endpoint object in events and directives, so that when the device exchanges messages with Alexa on behalf of the endpoint, both the device and Alexa know which connected endpoint the message is about. More information is available in the Envelope Version documentation, as well as the documentation for individual capability interfaces.
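As a rough illustration of the endpoint object described above, the sketch below builds an event envelope and adds an `endpoint` object only when the message concerns a connected endpoint. The field names (`header`, `messageId`, `endpointId`, `payload`) and the interface and event names are assumptions for illustration; the Envelope Version documentation is the authority on the actual shape.

```python
import json
import uuid


def build_event(namespace, name, payload, endpoint_id=None):
    """Build an event envelope, optionally scoped to a connected endpoint.

    All field names here are assumed for illustration.
    """
    event = {
        "header": {
            "namespace": namespace,
            "name": name,
            "messageId": str(uuid.uuid4()),
        },
        "payload": payload,
    }
    if endpoint_id is not None:
        # Tells Alexa which connected endpoint the message is about.
        event["endpoint"] = {"endpointId": endpoint_id}
    return {"event": event}


# "ExampleInterface" and "ExampleEvent" are hypothetical names.
print(json.dumps(build_event("ExampleInterface", "ExampleEvent", {}, "lamp-1"), indent=2))
```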

Think of the device as a proxy for the endpoint, offering an abstraction layer to connected endpoints for AVS.

Note that, unlike the cloud service in a traditional Smart Home Skills integration, the device is also itself an endpoint.

Endpoint types and connectivity

Endpoints may be physical or virtual. A physical endpoint might be a light bulb, thermostat, camera, or smart lock. A virtual endpoint might be a piece of software running on the device, like an installed app.

Endpoints might be separate physical entities or integrated components. For instance, a light bulb could be separate, perhaps in a ceiling lighting fixture. Alternatively, it could be integrated into the HTTP/2 device and modeled as a connected endpoint for separate control and targeting.

Alexa-defined JSON events and directives may be passed raw between the device and endpoint or translated into a different protocol, format, encoding, or signal. The only requirement is that the semantics are preserved.

If a connected endpoint is aware of Alexa, the endpoint could run software to understand and use Alexa events and directives. In such cases, the HTTP/2 device can pass messages between the endpoint and Alexa without translation, over whatever connection exists between the device and the endpoint.

By contrast, if a connected endpoint isn't Alexa-specific, the HTTP/2 device may need to translate the Alexa messages into the appropriate signal. For example, the device may receive a TurnOn directive for a particular light bulb endpoint. The directive might then be translated into the appropriate Zigbee signal that turns on the light bulb. This applies to virtual endpoints as well. For instance, if the device receives a directive to control an Android app that is also installed on the device, it might need to translate that directive into the appropriate Android API, then translate any response into the corresponding Alexa event.
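The translation step can be sketched as a lookup from a directive's namespace and name to a local command. In this sketch the `Alexa.PowerController` namespace, the envelope field names, and the `zigbee_on` and `zigbee_off` helpers are assumptions; the helpers only return strings where a real device would call into its Zigbee stack.

```python
from typing import Callable, Dict, Optional, Tuple


def zigbee_on(endpoint_id: str) -> str:
    # Placeholder for a call into a real Zigbee stack (hypothetical).
    return f"zigbee:{endpoint_id}:on"


def zigbee_off(endpoint_id: str) -> str:
    # Placeholder for a call into a real Zigbee stack (hypothetical).
    return f"zigbee:{endpoint_id}:off"


# Maps (namespace, name) to the local command that preserves its semantics.
TRANSLATIONS: Dict[Tuple[str, str], Callable[[str], str]] = {
    ("Alexa.PowerController", "TurnOn"): zigbee_on,
    ("Alexa.PowerController", "TurnOff"): zigbee_off,
}


def translate(message: dict) -> Optional[str]:
    """Translate a directive into a local command, or None if unsupported."""
    directive = message["directive"]
    header = directive["header"]
    handler = TRANSLATIONS.get((header["namespace"], header["name"]))
    if handler is None:
        return None
    return handler(directive["endpoint"]["endpointId"])


turn_on = {
    "directive": {
        "header": {"namespace": "Alexa.PowerController", "name": "TurnOn"},
        "endpoint": {"endpointId": "bulb-1"},
        "payload": {},
    }
}
print(translate(turn_on))  # zigbee:bulb-1:on
```

A table keeps the mapping in one place, so supporting another directive means adding an entry rather than another branch.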

Interfaces, directives, and events

The Alexa Voice Service (AVS) is an aggregation of various fine-grained interfaces. Each interface is a collection of directives and events, which correspond to specific device functionality.

  • Directives are messages sent from AVS that tell a device to perform a specific action, like playing audio from a specific URL or setting an alarm.
  • Events are messages sent from a device to AVS to notify Alexa that something has occurred. The most common event is a speech request from your user.

The following table provides a brief description of each interface exposed by the AVS API:

| Interface | Description |
| --- | --- |
| Alerts | The interface for setting, stopping, and deleting timers and alarms. For a conceptual overview, see Alerts Overview. |
| AudioActivityTracker | The interface that is used to inform Alexa which interface last occupied an audio channel. |
| AudioPlayer | The interface for managing and controlling audio playback that originates from an Alexa-managed queue. For a conceptual overview, see AudioPlayer Overview. |
| Bluetooth | The interface for managing connections with peer Bluetooth devices, such as smart phones and speakers. |
| EqualizerController | The interface that allows a product to adjust equalizer settings using Alexa, such as decibel (dB) levels and modes. |
| InputController | The interface that enables selecting and switching inputs on an Alexa-enabled product. |
| Notifications | The interface that delivers visual and audio indicators when notifications are available. For a conceptual overview, see Notifications Overview. |
| PlaybackController | The interface for navigating a playback queue via button presses or GUI affordances. |
| Settings | The interface that is used to manage the Alexa settings on your product, such as locale. |
| Speaker | The interface for controlling the volume of Alexa-originated content on your product, including mute and unmute. |
| SpeechRecognizer | The core interface for the Alexa Voice Service. Each user utterance leverages the Recognize event. |
| SpeechSynthesizer | The interface that returns Alexa TTS. |
| System | The interface that is used to send Alexa information about your product. |
| TemplateRuntime | The interface for rendering visual metadata. For a conceptual overview, see Display Cards Overview. |
| VisualActivityTracker | The interface that is used to inform Alexa when content is actively displayed to an end user. |

Capability interfaces

A capability interface ("capability", "interface", "API", or "namespace" for short) comprises a set of client functionality, like speech recognition, audio playback, and volume control.

Each capability interface defines logically grouped messages called directives and events. Alexa sends directives to the device, instructing it to take action. The device sends events to Alexa to indicate that something has occurred.

A capability interface might also define context entries, either of a generic variety or reportable state properties.

Individual messages, including context entries, include a namespace field identifying the capability interface that the event, directive, or context entry belongs to. The event, directive, or context entry is identified by the name field.
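Because every message carries a namespace and a name, a client can route incoming directives with a registry keyed on that pair. The sketch below shows one way to do this. The `handles` decorator and `dispatch` function are hypothetical, and the `Speaker` / `SetVolume` pair with its `volume` field is used only as an assumed example message.

```python
HANDLERS = {}


def handles(namespace, name):
    """Register a function as the handler for one (namespace, name) pair."""
    def register(func):
        HANDLERS[(namespace, name)] = func
        return func
    return register


@handles("Speaker", "SetVolume")
def set_volume(payload):
    # The `volume` payload field is assumed for illustration.
    return f"volume set to {payload['volume']}"


def dispatch(message):
    """Route a directive to its registered handler using namespace and name."""
    directive = message["directive"]
    header = directive["header"]
    handler = HANDLERS[(header["namespace"], header["name"])]
    return handler(directive.get("payload", {}))


example = {
    "directive": {
        "header": {"namespace": "Speaker", "name": "SetVolume"},
        "payload": {"volume": 40},
    }
}
print(dispatch(example))  # volume set to 40
```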

Changes

Changes to capabilities are managed through major.minor interface versions.
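A client that stores interface versions may want to compare them numerically rather than as strings, since "1.10" sorts before "1.9" in a string comparison. The following is a minimal sketch of parsing a major.minor version. The `InterfaceVersion` type and `parse_version` helper are hypothetical, and nothing here encodes the compatibility rules between versions, which this page does not describe.

```python
from typing import NamedTuple


class InterfaceVersion(NamedTuple):
    major: int
    minor: int


def parse_version(text):
    """Parse a 'major.minor' string; raises ValueError on any other shape."""
    major, minor = text.split(".")
    return InterfaceVersion(int(major), int(minor))


# Tuples compare field by field, so minor 10 is newer than minor 9.
print(parse_version("1.10") > parse_version("1.9"))  # True
```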

Assertion

Capabilities are always implemented by the device maintaining the HTTP/2 connection, but the device may implement them on behalf of connected endpoints. The documentation for each interface describes whether the device may implement it only on its own behalf or on behalf of any connected endpoints.

To assert support for a capability on its own behalf or on behalf of a connected endpoint, the device must use Alexa.Discovery (the successor to the deprecated Capabilities API).

The individual capability's documentation details how to assert support.
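To make the idea of asserting capabilities concrete, the sketch below assembles a single endpoints list that contains the device itself followed by its connected endpoints, each with the capabilities it supports. The field names (`endpointId`, `capabilities`, `type`, `interface`, `version`), the endpoint IDs, and the version strings are assumptions for illustration and should be checked against the Alexa.Discovery documentation.

```python
def capability(interface, version):
    """Describe one asserted capability (field names are assumed)."""
    return {"type": "AlexaInterface", "interface": interface, "version": version}


def endpoint(endpoint_id, capabilities):
    """Describe one endpoint and the capabilities asserted for it."""
    return {"endpointId": endpoint_id, "capabilities": list(capabilities)}


def discovery_payload(device, connected):
    """List the device first, then every connected endpoint it proxies."""
    return {"endpoints": [device, *connected]}


# Hypothetical IDs and placeholder versions.
device = endpoint("speaker-001", [capability("SpeechRecognizer", "1.0")])
bulb = endpoint("speaker-001-bulb", [capability("Alexa.PowerController", "1.0")])

payload = discovery_payload(device, [bulb])
print([e["endpointId"] for e in payload["endpoints"]])
# ['speaker-001', 'speaker-001-bulb']
```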

Interface naming conventions

When AVS first launched, the namespace of each capability interface was a single PascalCased word describing the functionality it enabled. As the number of interfaces increased, it became necessary to create more structure in their naming to more easily group together related functionality:

  • Newer interfaces leverage hierarchical namespaces, starting with the Alexa. root.
  • Each component of the namespace between periods still uses PascalCase.
  • A namespace doesn't necessarily depend on implementation of its hierarchical parents. For example, the Alexa.DoNotDisturb interface does not depend on the Alexa interface. Any dependencies are indicated in the documentation for the interface.
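The naming conventions above can be expressed as a small check. This sketch classifies a namespace as legacy (a single word) or hierarchical (under the Alexa root), and rejects names whose components do not start with an uppercase letter. The `classify_namespace` helper is hypothetical, and the pattern only approximates PascalCase by checking the first character and allowing letters and digits after it.

```python
import re

# One component: an uppercase letter followed by letters or digits.
_COMPONENT = r"[A-Z][A-Za-z0-9]*"
_NAMESPACE = re.compile(rf"{_COMPONENT}(?:\.{_COMPONENT})*")


def classify_namespace(namespace):
    """Return 'hierarchical', 'legacy', or 'invalid' for a namespace string."""
    if not _NAMESPACE.fullmatch(namespace):
        return "invalid"
    if namespace == "Alexa" or namespace.startswith("Alexa."):
        return "hierarchical"
    return "legacy"


for ns in ("AudioPlayer", "Alexa.DoNotDisturb", "alexa.doNotDisturb"):
    print(ns, "->", classify_namespace(ns))
```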

HTTP/2

AVS exposes an HTTP/2 service and expects multipart messages encoded for HTTP/2. The following pages provide information to help you manage a connection and structure requests.

Base URLs

The default base URLs for AVS changed on May 22, 2019. We recommend that all new and existing clients adopt the new URLs; however, the legacy base URLs will continue to be supported.

Base URLs

| Region | Supported Countries/Regions | URL |
| --- | --- | --- |
| Asia | Australia, Japan, New Zealand | https://alexa.fe.gateway.devices.a2z.com |
| Europe | Austria, France, Germany, India, Italy, Saudi Arabia, United Arab Emirates, Spain, United Kingdom | https://alexa.eu.gateway.devices.a2z.com |
| North America | Brazil, Canada, Mexico, United States | https://alexa.na.gateway.devices.a2z.com |

Legacy Base URLs

| Region | Supported Countries/Regions | URL |
| --- | --- | --- |
| Asia | Australia, Japan, New Zealand | https://avs-alexa-fe.amazon.com |
| Europe | Austria, France, Germany, India, Italy, Saudi Arabia, United Arab Emirates, Spain, United Kingdom | https://avs-alexa-eu.amazon.com |
| North America | Brazil, Canada, Mexico, United States | https://avs-alexa-na.amazon.com |
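A client might keep the base URLs in a lookup table keyed by region, as in the following sketch. The URLs are the ones listed in the tables above; the `base_url` helper and its `legacy` flag are hypothetical names chosen for illustration.

```python
BASE_URLS = {
    "Asia": "https://alexa.fe.gateway.devices.a2z.com",
    "Europe": "https://alexa.eu.gateway.devices.a2z.com",
    "North America": "https://alexa.na.gateway.devices.a2z.com",
}

LEGACY_BASE_URLS = {
    "Asia": "https://avs-alexa-fe.amazon.com",
    "Europe": "https://avs-alexa-eu.amazon.com",
    "North America": "https://avs-alexa-na.amazon.com",
}


def base_url(region, legacy=False):
    """Return the base URL for a region, preferring the current URLs."""
    table = LEGACY_BASE_URLS if legacy else BASE_URLS
    try:
        return table[region]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None


print(base_url("Europe"))  # https://alexa.eu.gateway.devices.a2z.com
```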

Device support for AVS updates

AVS periodically introduces new features, and these updates might introduce new directives or add new properties to existing directives. Keep this in mind when developing your device software: code such as your JSON parser shouldn't break when it encounters a new directive or payload property. For more details, see Voice Request Lifecycle.

For an example, see MessageInterpreter.cpp in the AVS Device SDK.
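The same idea can be sketched in a few lines: skip directives the client does not recognize, and keep only the payload properties it knows how to use. This sketch is not a port of MessageInterpreter.cpp. The `KNOWN_DIRECTIVES` table, the `interpret` helper, the envelope field names, and the `Speaker` / `SetMute` example with its `mute` field are assumptions for illustration.

```python
import json

# Directives this client understands, with the payload fields it reads.
KNOWN_DIRECTIVES = {
    ("Speaker", "SetMute"): {"mute"},
}


def interpret(raw):
    """Parse a directive, tolerating unknown directives and extra fields.

    Returns (key, payload) for a known directive, or None otherwise.
    """
    message = json.loads(raw)
    directive = message.get("directive", {})
    header = directive.get("header", {})
    key = (header.get("namespace"), header.get("name"))
    known_fields = KNOWN_DIRECTIVES.get(key)
    if known_fields is None:
        return None  # Unknown directive: skip it instead of failing.
    payload = directive.get("payload", {})
    # Drop properties added by newer service versions.
    return key, {k: v for k, v in payload.items() if k in known_fields}


newer = json.dumps({
    "directive": {
        "header": {"namespace": "Speaker", "name": "SetMute"},
        "payload": {"mute": True, "someFutureProperty": 1},
    }
})
print(interpret(newer))  # (('Speaker', 'SetMute'), {'mute': True})
```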

Need help?

If you have any questions, comments, or encounter issues with the AVS API, visit Stack Overflow.


Last updated: Nov 27, 2023