Speech (APL for Audio)

A Speech component converts the provided text into speech output. You can provide either plain text or SSML as input.


The Speech component has the base component properties and the following component-specific properties:

| Property | Type | Default | Description |
|---|---|---|---|
| content | String | "" | The text content to convert to speech. |
| contentType | One of PlainText, SSML | PlainText | The type of content provided in the content property. |


The content property holds the text to convert to speech. The content can be any UTF-8 string.

The content of a Speech component must not contain profanity. Any profanity is replaced with beeps during conversion.


The contentType property defines the type of content to convert from text to speech.

| Tag | Description |
|---|---|
| SSML | The audio engine validates the content and enforces SSML syntax before converting the text to speech. |
| PlainText | The audio engine converts the content to speech as normal text. |

When contentType is SSML and the SSML provided in content is syntactically incorrect, the audio engine fails to render the document and the request fails.

When you provide SSML content in the content property, be sure to enclose the speech within <speak> tags as you would in a normal outputSpeech response.

Because the SSML is a string value inside a JSON object, either escape any double quotation marks in the SSML with a backslash, or use single quotation marks around SSML attribute values.
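
As a sketch of the two quoting options, the following fragments carry the same SSML. The first uses single quotation marks around the SSML attribute value, and the second escapes double quotation marks with a backslash. The two components are wrapped in a JSON array only so that both variants fit in one valid JSON value; the array is an illustration device, not a required structure.

```json
[
  {
    "type": "Speech",
    "contentType": "SSML",
    "content": "<speak><amazon:effect name='whispered'>Hello user!</amazon:effect></speak>"
  },
  {
    "type": "Speech",
    "contentType": "SSML",
    "content": "<speak><amazon:effect name=\"whispered\">Hello user!</amazon:effect></speak>"
  }
]
```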

For the supported SSML tags and syntax, see the Speech Synthesis Markup Language (SSML) Reference.

When contentType is PlainText, any SSML content inside the content string is encoded. This results in Alexa "speaking out" the SSML tags instead of the intended behavior.
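
To illustrate this pitfall, the following sketch sets contentType to PlainText while content contains SSML markup. Based on the behavior described above, Alexa would read the tag aloud as text instead of pausing. The break tag is used here only as an example.

```json
{
  "type": "Speech",
  "contentType": "PlainText",
  "content": "Hello <break time='1s'/> user!"
}
```

To get the pause, set contentType to SSML and enclose the content in <speak> tags.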


Plain text example

The following example shows a Speech component that speaks the text "Hello user!".

{
  "type": "Speech",
  "contentType": "PlainText",
  "content": "Hello user!"
}
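
For context, a Speech component normally sits inside an APL for Audio document. The following is a minimal sketch, assuming the usual APLA document layout with a mainTemplate and a document version of 0.91. Both are assumptions that this topic does not state, so check the version and structure your skill targets.

```json
{
  "type": "APLA",
  "version": "0.91",
  "mainTemplate": {
    "parameters": ["payload"],
    "item": {
      "type": "Speech",
      "contentType": "PlainText",
      "content": "Hello user!"
    }
  }
}
```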

SSML example

The following example shows a Speech component with SSML. This example speaks the text "Hello user!" with the "whispered" effect.

{
  "type": "Speech",
  "contentType": "SSML",
  "content": "<speak><amazon:effect name='whispered'>Hello user!</amazon:effect></speak>"
}

For details and more examples of SSML tags, see the Speech Synthesis Markup Language (SSML) Reference.