Speech Synthesis Markup Language, or SSML, is a standardized markup language that allows developers to control pronunciation, intonation, timing, and emotion. SSML support on Alexa allows you to control how Alexa generates speech from your skill’s text responses. You can add pauses, change pronunciation, spell out a word, add short audio snippets, and insert speechcons (special words and phrases) into your skill. These SSML features provide a more natural voice experience.
Today, we are excited to announce five new SSML tags in the US, UK, and Germany that you can use with Alexa, including whispers, expletive bleeps, and more.
In addition, today we rolled out speechcons in the UK and Germany. Let me explain what the new tags are and how to use them.
The new <amazon:effect> tag, used with name="whispered", lets Alexa speak in a softer, whispered voice. Note that <amazon:effect> requires a closing tag.
To hear Alexa whisper, paste SSML that uses this tag into the voice simulator on the developer portal.
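As a minimal sketch of the whisper effect, the snippet below wraps one sentence in the tag. The spoken text is illustrative rather than taken from the original sample, and it assumes the usual <speak> root element around the response.

```xml
<speak>
    I want to tell you a secret.
    <!-- Only the text between the opening and closing tags is whispered. -->
    <amazon:effect name="whispered">I am not a real human.</amazon:effect>
    Can you believe it?
</speak>
```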
Sometimes you need to “bleep” out a word or two to make content acceptable for a general audience. This is exactly what interpret-as="expletive" does: it bleeps out a word that may cause offense. Notice that “expletive” is used with the <say-as> tag.
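A short sketch of how this might look in a response follows. The sentence and the bleeped word are placeholders chosen for illustration.

```xml
<speak>
    I stubbed my toe and yelled
    <!-- The enclosed word is replaced by a bleep when spoken. -->
    <say-as interpret-as="expletive">ouch</say-as>!
</speak>
```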
By itself, <sub> is a little less intuitive than the rest of this group. You can use this tag when you want Alexa to say something other than what is written. For example, if you want her to speak the full words “aluminum” or “magnesium” rather than just say their chemical symbols, you would wrap each symbol in <sub> and put the full word in its alias attribute.
For extra credit (and a laugh!), try <sub alias="aluminum">Al</sub> in the American (en-US) voice simulator and <sub alias="aluminium">Al</sub> in the British (en-GB) one.
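The following sketch shows the substitution for both elements. The surrounding sentence is illustrative; only the alias attribute and the enclosed symbols matter.

```xml
<speak>
    <!-- Alexa speaks the alias value instead of the written symbol. -->
    My favorite chemical element is <sub alias="aluminum">Al</sub>,
    but my second favorite is <sub alias="magnesium">Mg</sub>.
</speak>
```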
The <emphasis> tag allows you to change the rate and volume at which Alexa speaks. Remember when you were little (and in trouble) and a parent would begin talking low and slow to get your attention? Well, that’s exactly what this does for Alexa and the tenor of the conversation. It varies the dialog and thus maintains engagement.
Try the various options in the voice simulator: none, moderate, strong, reduced. Note that "reduced" lowers volume and increases speed, which reduces emphasis.
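One way to compare the options is to place them side by side, as in this sketch. It assumes the option is passed through the level attribute, and the sentences are made up for illustration.

```xml
<speak>
    <!-- "strong" is spoken louder and slower. -->
    I <emphasis level="strong">really</emphasis> need you to listen.
    <!-- "reduced" is spoken softer and faster. -->
    <emphasis level="reduced">This part is just a side note.</emphasis>
</speak>
```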
Finally, <prosody> provides the ultimate control over the volume, pitch, and rate of Alexa's speech. But with greater control comes greater responsibility. While it’s fun to make Alexa sound like ET, it’s really not what we’re aiming for here. To maintain intelligible speech and to provide the best user experience, the amount of change that can be applied to rate, pitch, and volume is limited.
Try these examples in the voice simulator on the developer portal.
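A sketch of one such example is shown here, with one attribute per sentence so that each effect can be heard in isolation. The sentences are illustrative, and the attribute values are named presets from the standard SSML vocabulary.

```xml
<speak>
    This sentence uses the default voice.
    <prosody volume="x-loud">This sentence is louder.</prosody>
    <prosody rate="x-slow">This sentence is spoken quite slowly.</prosody>
    <prosody pitch="x-high">This sentence has a very high pitch,</prosody>
    <prosody pitch="low">and this one is lower.</prosody>
</speak>
```

The attributes can also be combined on a single <prosody> element, for example rate="slow" together with volume="soft".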
In this post, we’ve looked at five new SSML tags that help control speech output for Alexa. Now it’s time to put theory into practice. Take a look at the quiz game template, which makes use of SSML. Assemble the basic skill and get it working. Then make it your own and see if you can add any of the new tags we’ve just reviewed.
For more information about getting started with Alexa and SSML, check out the following:
And join us for a live webinar on SSML on May 18. We'll walk through all of the supported tags and show you how to level up your Alexa responses with SSML.