SpeechSynthesizer 1.0
When you ask Alexa a question, the SpeechSynthesizer interface returns the appropriate speech response.
For example, if you ask Alexa "What's the weather in Seattle?", your client receives a Speak directive from the Alexa Voice Service (AVS).
This directive contains a binary audio attachment with the appropriate answer, which you must process and play.
States
SpeechSynthesizer has the following states:
- PLAYING - When Alexa speaks, SpeechSynthesizer is in the PLAYING state. SpeechSynthesizer transitions to the FINISHED state when speech playback completes.
- FINISHED - When Alexa finishes speaking, SpeechSynthesizer transitions to the FINISHED state with a SpeechFinished event.
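The two states above can be sketched as a small tracker. This is only an illustration: `PlayerActivity`, `SpeechStateTracker`, and its method names are hypothetical, and starting in FINISHED before any speech has played is an assumption rather than something this page specifies.

```python
from enum import Enum


class PlayerActivity(Enum):
    PLAYING = "PLAYING"
    FINISHED = "FINISHED"


class SpeechStateTracker:
    """Hypothetical helper that mirrors the two states listed above."""

    def __init__(self) -> None:
        # Assumption: nothing has played yet, so start out as FINISHED.
        self.activity = PlayerActivity.FINISHED

    def playback_started(self) -> None:
        # Alexa is speaking.
        self.activity = PlayerActivity.PLAYING

    def playback_completed(self) -> str:
        """Move to FINISHED and return the name of the event to send."""
        if self.activity is not PlayerActivity.PLAYING:
            raise RuntimeError("playback_completed called while not PLAYING")
        self.activity = PlayerActivity.FINISHED
        return "SpeechFinished"


tracker = SpeechStateTracker()
tracker.playback_started()
event_name = tracker.playback_completed()
```

Returning the event name from `playback_completed` keeps the transition and the SpeechFinished event tied together, as the state description requires.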
Capability assertion
SpeechSynthesizer 1.0 may be implemented by the device on its own behalf, but not on behalf of any connected endpoints.
New AVS integrations must assert support through Alexa.Discovery, but Alexa will continue to support existing integrations using the Capabilities API.
Sample Object
{ "type": "AlexaInterface", "interface": "SpeechSynthesizer", "version": "1.0" }
Context
For each currently playing TTS that requires context, your client must report playerActivity and offsetInMilliseconds.
To learn more about reporting Context, see Context Overview.
Sample Message
{ "header": { "namespace": "SpeechSynthesizer", "name": "SpeechState" }, "payload": { "token": "{{STRING}}", "offsetInMilliseconds": {{LONG}}, "playerActivity": "{{STRING}}" } }
Payload Parameters
Parameter | Description | Type |
---|---|---|
token | An opaque token provided in the Speak directive. | string |
offsetInMilliseconds | Identifies the current TTS offset in milliseconds. | long |
playerActivity | Identifies the component state of SpeechSynthesizer. Accepted values: PLAYING, FINISHED, or INTERRUPTED. | string |
Player Activity | Description |
---|---|
PLAYING | Speech is playing. |
FINISHED | Speech finished playing. |
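As a rough sketch, the SpeechState context entry can be assembled from the three payload parameters. The function name `build_speech_state` and the sample token are hypothetical; the field names and accepted values come from the tables above.

```python
import json

# Accepted playerActivity values, taken from the payload table above.
ACCEPTED_ACTIVITIES = {"PLAYING", "FINISHED", "INTERRUPTED"}


def build_speech_state(token: str, offset_ms: int, player_activity: str) -> dict:
    """Return a SpeechState context entry shaped like the sample message."""
    if player_activity not in ACCEPTED_ACTIVITIES:
        raise ValueError(f"unexpected playerActivity: {player_activity!r}")
    return {
        "header": {"namespace": "SpeechSynthesizer", "name": "SpeechState"},
        "payload": {
            "token": token,
            "offsetInMilliseconds": int(offset_ms),
            "playerActivity": player_activity,
        },
    }


# Hypothetical values for illustration only.
state = build_speech_state("example-token", 1500, "PLAYING")
print(json.dumps(state, indent=2))
```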
Directives
Speak
AVS sends a Speak directive to your client every time Alexa delivers a speech response. Your client can receive a Speak directive in two ways:
- When a user makes a voice request, such as asking Alexa a question. AVS sends a Speak directive to your client after it receives a Recognize event.
- When a user performs an action, such as setting a timer. First, the timer starts with the SetAlert directive. Second, AVS sends a Speak directive to your client, notifying you that the timer started.
Sample Message
The Speak directive is a multipart message with two parts: one JSON-formatted directive and one binary audio attachment.
JSON
{ "directive": { "header": { "namespace": "SpeechSynthesizer", "name": "Speak", "messageId": "{{STRING}}", "dialogRequestId": "{{STRING}}" }, "payload": { "url": "{{STRING}}", "format": "{{STRING}}", "token": "{{STRING}}" } } }
Binary Audio Attachment
The following multipart headers precede the binary audio attachment.
Content-Type: application/octet-stream
Content-ID: {{Audio Item CID}}
{{BINARY AUDIO ATTACHMENT}}
Header Parameters
Parameter | Description | Type |
---|---|---|
messageId | A unique ID used to represent a specific message. | string |
dialogRequestId | A unique ID used to correlate directives sent in response to a specific Recognize event. | string |
Payload Parameters
Parameter | Description | Type |
---|---|---|
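url | A unique identifier for audio content. The value always starts with the prefix cid: and matches the Content-ID of the binary audio attachment. Example: cid:{{STRING}} | string |
format | Provides the format of returned audio. Accepted value: "AUDIO_MPEG" | string |
token | An opaque token that represents the current text-to-speech (TTS) object. | string |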
token | An opaque token that represents the current text-to-speech (TTS) object. | string |
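The link between the JSON part and the binary part can be sketched as follows: strip the cid: prefix from the url field and use the remainder to look up the attachment. The helpers `audio_content_id` and `find_audio` are hypothetical, and so is the `attachments` mapping. The sketch assumes the caller has already split the multipart response and keyed each binary part by its Content-ID header value, with any surrounding angle brackets removed.

```python
import json


def audio_content_id(directive_json: str) -> str:
    """Extract the Content-ID referenced by a Speak directive's url field."""
    directive = json.loads(directive_json)["directive"]
    header = directive["header"]
    if (header["namespace"], header["name"]) != ("SpeechSynthesizer", "Speak"):
        raise ValueError("not a SpeechSynthesizer.Speak directive")
    url = directive["payload"]["url"]
    if not url.startswith("cid:"):
        raise ValueError(f"expected a cid: URL, got {url!r}")
    return url[len("cid:"):]


def find_audio(directive_json: str, attachments: dict) -> bytes:
    """Look up the binary attachment whose Content-ID matches the directive.

    `attachments` is a hypothetical mapping of Content-ID -> bytes built by
    the caller while reading the multipart response.
    """
    return attachments[audio_content_id(directive_json)]


# Hypothetical sample values for illustration only.
sample = json.dumps({
    "directive": {
        "header": {"namespace": "SpeechSynthesizer", "name": "Speak",
                   "messageId": "msg-1", "dialogRequestId": "dlg-1"},
        "payload": {"url": "cid:audio-123", "format": "AUDIO_MPEG",
                    "token": "tok-1"},
    }
})
audio = find_audio(sample, {"audio-123": b"\xff\xfb"})
```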
Events
SpeechStarted
Send the SpeechStarted event to AVS after your client processes the Speak directive and begins playback of synthesized speech.
Sample Message
{ "event": { "header": { "namespace": "SpeechSynthesizer", "name": "SpeechStarted", "messageId": "{{STRING}}" }, "payload": { "token": "{{STRING}}" } } }
Header Parameters
Parameter | Description | Type |
---|---|---|
messageId | A unique ID used to represent a specific message. | string |
Payload Parameters
Parameter | Description | Type |
---|---|---|
token | The opaque token provided by the Speak directive. | string |
SpeechFinished
When Alexa finishes speaking, send the SpeechFinished event. Send the event only after Alexa fully processes the Speak directive and finishes rendering the TTS. If a user cancels TTS playback, don't send the SpeechFinished event. For example, if a user interrupts the Alexa TTS with "Alexa, stop," don't send a SpeechFinished event.
Sample Message
{ "event": { "header": { "namespace": "SpeechSynthesizer", "name": "SpeechFinished", "messageId": "{{STRING}}" }, "payload": { "token": "{{STRING}}" } } }
Header Parameters
Parameter | Description | Type |
---|---|---|
messageId | A unique ID used to represent a specific message. | string |
Payload Parameters
Parameter | Description | Type |
---|---|---|
token | The opaque token provided by the Speak directive. | string |
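A minimal sketch of the event flow for one Speak directive: SpeechStarted is sent when playback begins, and SpeechFinished follows only when playback was not cancelled. The names `build_event`, `events_for_playback`, and the `cancelled` flag are hypothetical. Generating messageId with a UUID is an assumption, since this page only requires the ID to be unique.

```python
import uuid


def build_event(name: str, token: str) -> dict:
    """Build a SpeechStarted or SpeechFinished event for the given token."""
    if name not in ("SpeechStarted", "SpeechFinished"):
        raise ValueError(f"unsupported event name: {name!r}")
    return {
        "event": {
            "header": {
                "namespace": "SpeechSynthesizer",
                "name": name,
                # Assumption: a random UUID is unique enough for messageId.
                "messageId": str(uuid.uuid4()),
            },
            # Echo the opaque token from the Speak directive.
            "payload": {"token": token},
        }
    }


def events_for_playback(token: str, cancelled: bool) -> list:
    """Return the events to send for one Speak directive.

    `cancelled` is a hypothetical flag set by the player when the user
    stops playback before the TTS finishes rendering.
    """
    events = [build_event("SpeechStarted", token)]
    if not cancelled:
        events.append(build_event("SpeechFinished", token))
    return events


completed = events_for_playback("tok-1", cancelled=False)
interrupted = events_for_playback("tok-1", cancelled=True)
```

In a real client the two events are sent at different times (at playback start and at playback end); they are returned together here only to keep the example short.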