We’re excited to introduce two new Alexa capabilities that will help create a more natural and intuitive voice experience for your customers. Starting today, in the US, you can enable Alexa to respond in either a happy/excited or a disappointed/empathetic tone. Emotional responses are particularly relevant to skills in the gaming and sports categories. Additionally, you can have Alexa respond in a speaking style better suited to a specific type of content, starting with news and music. Speaking styles are curated text-to-speech voices designed to create a more delightful customer experience for specific content. For example, the news speaking style makes Alexa’s voice sound similar to what you hear from TV news anchors and radio hosts. To learn more, see our technical documentation for emotions and speaking styles.
Alexa emotions use Neural TTS (NTTS), Amazon’s text-to-speech technology that enables more natural-sounding speech. For example, you can have Alexa respond in a happy/excited tone when a customer answers a trivia question correctly or wins a game. Similarly, you can have Alexa respond in a disappointed/empathetic tone when a customer asks for a sports score and their favorite team has lost. Early customer feedback indicates that overall satisfaction with the voice experience increased by 30% when Alexa responded with emotions.
Alexa’s new speaking styles also use Neural TTS (NTTS) technology. Starting today, you can enable two speaking styles in the US: news and music. In Australia, you can enable an Australia-specific news speaking style. The news and music speaking styles tailor Alexa’s voice to the content being delivered by changing aspects of speech such as intonation, which words are emphasized, and the timing of pauses. In blind listening tests, listeners perceived the news style as 31% more natural than Alexa’s standard voice and the music style as 84% more natural.
To get started with Alexa emotions, use the newly published SSML tag. Wrap Alexa’s response in the amazon:emotion tag and set two attributes: name, the emotion to apply (‘excited’ or ‘disappointed’), and intensity, how strongly to apply it (‘low’, ‘medium’ or ‘high’). Both attributes must be specified for the SSML tag to work correctly:
<amazon:emotion name="excited" intensity="medium">Christina wins this round!</amazon:emotion>
<amazon:emotion name="disappointed" intensity="high">Here I am with a brain the size of a planet and they ask me to pick up a piece of paper.</amazon:emotion>
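If your skill builds responses in code, a small helper can make sure both attributes are always present and valid. The following is a minimal sketch in Python, not an official SDK function: the name emotion_ssml and the constants are hypothetical, and the allowed values are simply those listed in this post.

```python
from xml.sax.saxutils import escape

# Allowed values as listed in this post (assumption: no others are accepted).
EMOTIONS = {"excited", "disappointed"}
INTENSITIES = {"low", "medium", "high"}


def emotion_ssml(text: str, name: str, intensity: str) -> str:
    """Wrap text in an amazon:emotion tag (hypothetical helper).

    Both attributes are required, so invalid values raise early
    instead of producing a tag that will not work.
    """
    if name not in EMOTIONS:
        raise ValueError(f"unsupported emotion: {name!r}")
    if intensity not in INTENSITIES:
        raise ValueError(f"unsupported intensity: {intensity!r}")
    # Escape &, < and > so the spoken text cannot break the markup.
    body = escape(text)
    return (
        f'<amazon:emotion name="{name}" intensity="{intensity}">'
        f"{body}</amazon:emotion>"
    )


if __name__ == "__main__":
    print(emotion_ssml("Christina wins this round!", "excited", "medium"))
```

Calling the helper with the first example above reproduces the same tag, while a missing or misspelled intensity fails before the response ever reaches Alexa.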
To get started with speaking styles, use the syntax associated with the appropriate speaking style below:
<amazon:domain name="news">A miniature manuscript written by the teenage Charlotte Bronte is returning to her childhood home in West Yorkshire after it was bought by a British museum at auction in Paris.</amazon:domain>
<amazon:domain name="music">Sweet Child O’ Mine by Guns N’ Roses became one of their most successful singles, topping the Billboard Hot 100 in 1988. Slash’s guitar solo on this song was ranked the 37th greatest solo of all time. Here’s Sweet Child O’ Mine.</amazon:domain>
Skill developers in Australia should use the same <amazon:domain name="news"> SSML tag; we will automatically use the news (AU) speaking style based on the locale of the skill.
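The speaking-style tags can be generated the same way. This is a rough sketch under the same assumptions: domain_ssml is a hypothetical helper, and the two style names are the ones announced in this post. Because the AU news style is selected from the skill's locale, the helper deliberately takes no locale argument.

```python
from xml.sax.saxutils import escape

# Speaking styles named in this post (assumption: no others are accepted).
SPEAKING_STYLES = {"news", "music"}


def domain_ssml(text: str, style: str) -> str:
    """Wrap text in an amazon:domain tag (hypothetical helper).

    No locale parameter: per the note above, Australian skills use the
    same name="news" value and the AU voice is chosen automatically.
    """
    if style not in SPEAKING_STYLES:
        raise ValueError(f"unsupported speaking style: {style!r}")
    # Escape &, < and > so the spoken text cannot break the markup.
    return f'<amazon:domain name="{style}">{escape(text)}</amazon:domain>'


if __name__ == "__main__":
    print(domain_ssml("Here are today's top stories.", "news"))
```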