A voice experience is a two-way interaction between a user and a system based upon the fundamentals of human conversation. For an effective conversation to take place between people, speakers should have a shared mental model and communicate their goals, questions, and intentions. The more that your voice experience can make use of human conversation, the less you have to teach users how to use the system.
When you design a voice experience, your role as the designer is to map out these conversations: delve into the user's needs and journey, and storyboard the full experience. Whether you work as a lone skill builder in a design phase or as a designer with a team of developers, iterating on these designs helps you create a compelling experience that users will love.
The process of designing voice experiences includes several phases: conceptualizing your voice-experience idea, drawing out your users' journey, designing the personality of your skill, creating storyboards, prototyping your designs, and testing your designs and iterating to make improvements.
Users are attracted to voice experiences because of their strengths. The following list shows the strengths of voice user interfaces (VUI):
In addition, users look to skills to add value to their everyday lives. Here are some good questions to ask yourself when conceptualizing an Alexa skill.
A user journey is composed of the interactions a user has with your brand of products and the goals they complete within them.
Let’s dig a little deeper into what that means.
Interactions are how a user steps through a product. On a website, these interactions are clicks through hyperlinks. On mobile devices, these are various taps and swipes. And through Alexa, these are user utterances.
Goals are what your user is attempting to achieve along the way with your product. Most of the time, a user is trying to achieve these goals in a simple, quick, frictionless way.
When you draw out a user journey, it should follow this format.
These user journeys shouldn’t be viewed as flow diagrams. A user journey always goes from one goal to the next and never branches. The focus is not on showing every possible path a user might take, but on the ideal path that leads the user toward their goals.
Alexa skills are, by their nature, two-way dialogs. A user speaks and your skill responds. Both the user and your skill work together to achieve a goal. Now it’s time to look at the skill response side of the dialog. Instead of going straight into dialog writing, however, step back and form a clear picture of the personality of your skill. This personality is a combination of your skill’s spoken and visual identity. Your skill’s personality should feel like one singular voice to the user.
To help build out this personality, we have a few simple steps:
To help get you started, download the following quick reference sketch file.
When you design a skill, there’s a lot to consider from both the user's and Alexa's perspective.
With a user, you need to design for the following elements:
With Alexa, you need to design for the following responses:
To combine all these elements together into a single artifact, you use a design method called storyboarding. A storyboard is the design artifact composed of a user journey, screens, and scripts. Storyboards are helpful to organize and convey to others how the users will interact with the skill.
The following example is a storyboard for a skill in which someone can order cake from a bakery.
The user journey section of the storyboard is where you give the context of all the user's goals from the start to the end of their journey. The storyboard walks you through this user journey. Each frame of the storyboard focuses on a specific goal. In this case, the user's goal is to find a cake to order.
The screen section of the storyboard is where you give an example of how the screen will be displayed. When you first start out with your screens, don’t worry about laying them out perfectly, but instead focus on what will be the main content.
The script is where you write out both the user utterances and Alexa’s responses. It might seem intimidating to begin writing these scripts. Use the user’s goal (finding a cake) as your guide when you write your dialog. Try to be as simple and straightforward as possible in reaching this goal.
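The three parts of a storyboard (user journey goal, screen, and script) can also be captured as plain data while you draft. The following is a minimal sketch in Python, not an official format. The class names `Turn` and `Frame`, the `render` helper, and the sample dialog are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """One line of the script: who speaks and what they say."""
    speaker: str  # "User" or "Alexa"
    line: str


@dataclass
class Frame:
    """One storyboard frame: a single user goal, its screen, and its script."""
    goal: str
    screen: str  # rough note on the main screen content, not a layout
    script: list[Turn] = field(default_factory=list)


# Hypothetical frame for the cake-ordering example; the wording is made up.
find_cake = Frame(
    goal="Find a cake to order",
    screen="List of available cakes",
    script=[
        Turn("User", "Alexa, open the bakery."),
        Turn("Alexa", "Welcome. What kind of cake would you like?"),
        Turn("User", "A chocolate cake."),
    ],
)

# Frames are kept in journey order, one goal per frame, with no branching.
storyboard = [find_cake]


def render(frames: list[Frame]) -> str:
    """Return the storyboard as plain text for review."""
    lines = []
    for number, frame in enumerate(frames, start=1):
        lines.append(f"Frame {number}: {frame.goal}")
        lines.append(f"  Screen: {frame.screen}")
        lines.extend(f"  {turn.speaker}: {turn.line}" for turn in frame.script)
    return "\n".join(lines)


print(render(storyboard))
```

Keeping the frames in a simple ordered list mirrors the idea that a storyboard follows the ideal path from goal to goal rather than every possible branch.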
To help you get started with storyboarding, see the Sketch files. In the files, you’ll find storyboard templates to fill in for your own skill types.
A prototype is useful for communicating your design with stakeholders, as well as putting your concepts in front of users for feedback. After a few storyboards, you can use them as a guide to begin prototyping your design.
The following tools can help you make this process simple.
With Adobe XD, you can design APL screens more easily by using the Adobe XD UI Kit. You can stitch together the screens with interactions, such as tap, touch, and voice triggers. When you move between screens, Adobe XD enables speech playback and visual display. You can view the prototype within Adobe XD and on your Echo Show device. Simply say, “Alexa, open Adobe XD” to try your prototype on your own device.
The Alexa Design System Sketch toolkit includes libraries and templates to design multimodal skills built with the Alexa Presentation Language (APL). These libraries and templates represent the code-backed Alexa styles and Alexa layout packages. The responsive templates and responsive components automatically adapt to different viewport profiles. Amazon updates the toolkit with every major release of APL, so you always have the most advanced tools for your design.
The downloadable toolkit includes the following features:
Spoken words aren't the same as written text. As described in our article about designing your skill’s persona, the spoken word can differ in tone, pitch, rate of speech, and stress on words. A voice can be soothing or startling. Using the Amazon Polly Text-to-Speech tool, you can listen to what you’ve written in your scripts and download example Alexa responses. The tool also lets you use Speech Synthesis Markup Language (SSML) to add pauses and other speech effects to your speech output.
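As a small illustration of adding pauses with SSML, the following Python sketch wraps script lines in a `<speak>` element and places a `<break>` tag between them. The `to_ssml` helper, the 400 ms default, and the sample sentences are assumptions made for this example. You can paste the resulting string into a text-to-speech tool to hear the effect.

```python
from xml.sax.saxutils import escape


def to_ssml(sentences: list[str], pause_ms: int = 400) -> str:
    """Join sentences into one SSML document with a pause between each."""
    pause = f'<break time="{pause_ms}ms"/>'
    # Escape &, <, and > so script text cannot break the markup.
    body = pause.join(escape(sentence) for sentence in sentences)
    return f"<speak>{body}</speak>"


ssml = to_ssml(["Welcome back.", "Would you like to order a cake?"])
print(ssml)
# <speak>Welcome back.<break time="400ms"/>Would you like to order a cake?</speak>
```

Listening to the same script with different pause lengths is a quick way to check whether a response sounds rushed or stilted before you commit to it.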
After you’ve built out a prototype, make sure to put it in front of actual people. It’s best if you can get the prototype in front of users who have no involvement in the product. If your budget allows, there are tools, such as usertesting.com (http://usertesting.com/), which can recruit users for you. If you lack sufficient budget, enlist family and friends. With testing, some user feedback is always better than no feedback.
User feedback can help you answer the following questions:
Testing early gives you confidence that you’re making the right decisions before you start coding a solution.
Keep the following best practices in mind while you design your voice experiences.
Use unambiguous, direct, and clear language. Direct language helps users see that the skill they are interacting with is cooperating with them. This directness shows in simple, elegant syntactic structures that are crisp and easy to parse and understand.
Don’t assume that users will say the exact phrase that you anticipate for an intent. While a user might say, "Plan a trip," they might also say, "Plan a vacation to Hawaii." To make sure your skill can respond to a variety of user utterances, provide a wide range of sentences, phrases, and words that users are likely to say.
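One way to brainstorm that range is to combine alternative verbs, objects, and optional endings into a list of candidate sample utterances. The Python sketch below does this and prints an intent entry that loosely follows the shape of an interaction model (a name, slots, and samples). The intent name `PlanTripIntent`, the slot name `destination`, and the slot type `DESTINATION` are hypothetical, and the word lists are only a starting point. Read the generated phrases aloud and drop any that no real user would say.

```python
import json
from itertools import product

# Hypothetical word lists; extend them with phrases heard in user testing.
verbs = ["plan", "book", "organize"]
objects = ["a trip", "a vacation", "a getaway"]
tails = ["", " to {destination}"]  # with and without a destination slot

# Every combination of verb, object, and optional ending.
samples = [f"{verb} {obj}{tail}" for verb, obj, tail in product(verbs, objects, tails)]

intent = {
    "name": "PlanTripIntent",  # hypothetical intent name
    "slots": [{"name": "destination", "type": "DESTINATION"}],  # hypothetical slot type
    "samples": samples,
}

print(json.dumps(intent, indent=2))
```

Generated combinations are a drafting aid, not a substitute for utterances collected from real users.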
Avoid error messages that only say that Alexa didn't hear or understand the user, such as “I didn't hear you.” This kind of response leads users to repeat the same phrase that caused the error. Instead, give more helpful information and make your directions as explicit as possible.
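A more helpful error response can restate the question and offer concrete phrases the user can say. The following Python sketch shows one possible way to assemble such a response. The `build_reprompt` helper and the cake examples are hypothetical and not part of any Alexa library.

```python
def build_reprompt(question: str, examples: list[str]) -> str:
    """Restate the question and offer concrete phrases the user can say."""
    if not examples:
        return question
    quoted = [f'"{example}"' for example in examples]
    if len(quoted) == 1:
        options = quoted[0]
    else:
        # Join as: "a", "b" or "c"
        options = ", ".join(quoted[:-1]) + " or " + quoted[-1]
    return f"{question} You can say {options}."


print(build_reprompt("What kind of cake would you like?", ["chocolate", "vanilla"]))
# What kind of cake would you like? You can say "chocolate" or "vanilla".
```

Keep the list of examples short, since users have to hold spoken options in memory.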