Creating an Alexa skill is a lot like cooking a delicious meal. There are several ingredients, and recipes are based on your taste and preferences. The Alexa Skill-Building Cookbook on GitHub gives you the ingredients and recipes you need to build engaging Alexa skills, using short code samples and guidance for adding features to your skill. With each installment of the Alexa skill recipe series, we’ll introduce you to a new recipe that can help you improve your skill design and customer engagement. You’re the chef. Now, let’s get cooking.
Speech is a fundamental mode of interaction. When we speak to others, and they respond, we're engaging in dialog. More precisely, dialog is the exchange of speech between two or more people. Dialog itself has many nuanced properties. For instance, dialogs happen at an unwritten, yet innate cadence: one person speaks, there's a pause, then the other person speaks. It's no fun when one person monopolizes a dialog without letting the other speak. The inverse, when you speak and no one responds, is equally frustrating. Imagine trying to have an engaging conversation with a brick wall (unless the brick wall were Alexa-enabled, in which case I digress). In fact, you could argue that neither of these scenarios is a conversation at all!
A quick and informative response is the baseline expectation people have when interacting through conversation. That's why voice designers have to keep this principle front and center. In this post, we share how you can improve your skill's customer experience with a recipe that addresses this principal element of conversation: responsiveness.
It’s simple enough to make an HTTP request from your Alexa skill. But one of the real challenges that arises when connecting to live services is endpoint latency. A customer utterance followed by silence from Alexa conflicts with the customer's expectation of natural dialog, and that can ultimately impact both the reception and adoption of your skill.
The latency problem becomes especially apparent when you’re fetching heavy data like high-quality images or videos, or when you’re executing a computationally expensive REST query that could take a couple of seconds to complete (e.g., searching a catalog for a particular record).
That's why informing your customers that they may have to wait a bit while your skill is computing is a good idea. Imagine if you were speaking with your friend on the phone and they asked you a question. If you had to look up the answer, the polite thing to do would be to ask your friend to wait or “hold on a moment.” This is also why progress bars, load screens, and spinning wheels are commonly used in graphical user interfaces to indicate a computer has received instruction and is taking some time to compute.
By setting a fair expectation, we improve our customers' perception of speediness and thereby decrease the skill's perceived latency without doing one bit of performance engineering.
To achieve a similar effect with Alexa, we can use the Progressive Response API. Instead of remaining silent while your skill is processing a request, we can let the customer know that the skill is working. It's essentially an auditory loading indicator.
Let's make this easy on ourselves by creating a separate helper function to abstract the functionality.
The first thing we need to do is instantiate a generic Alexa Directive Service. Then we need to extract a requestId, apiEndpoint, and apiAccessToken from the skill's event object. The requestId is a unique identifier for a specific request sent from Alexa to your skill. The apiAccessToken, provided in the event's context object and included in every request sent to your skill, authorizes calls to Alexa APIs. Finally, the apiEndpoint is the endpoint for the Progressive Response API, which varies depending on the geographic location of your skill. We can see this in action in lines 2 through 7 below.
1. function callDirectiveService(event) {
2. // Instantiate Alexa Directive Service
3. const ds = new Alexa.services.DirectiveService();
4. // Extract Variables
5. const requestId = event.request.requestId;
6. const endpoint = event.context.System.apiEndpoint;
7. const token = event.context.System.apiAccessToken;
8. // Instantiate Progressive Response Directive
9. const directive = new Alexa.directives.VoicePlayerSpeakDirective(requestId, 'Please wait...');
10. // Enqueue the directive for delivery to the device
11. return ds.enqueue(directive, endpoint, token);
12. };
Lines 8 through 11 above are the real magic sauce for this recipe. We create a VoicePlayerSpeakDirective with the interstitial content Alexa should say, then enqueue it with the Directive Service. The directive is delivered to the device asynchronously, so Alexa speaks the interstitial content while your skill continues processing the request.
The final code that executes the helper function above looks something like this:
1. const directiveServiceCall = callDirectiveService(this.event);
2. const httpRequest = getHttpRequest('http://some-endpoint-here.com');
3.
4. Promise.all([directiveServiceCall, httpRequest])
5. .then(([directiveResult, apiData]) => {
6.
7. const speechOutput = 'The answer from your API call is ' + apiData;
8. this.response
9. .speak(speechOutput);
10. this.emit(':responseReady');
11. });
Take a look at line 4. We want the directive with the interstitial content, or the content we’ll present while waiting, to be dispatched while the HTTP request is still in flight. Since Node.js is inherently asynchronous, we coordinate the two calls with Promise.all, which takes an ordered array of promises and resolves with their results in the same order once both have completed. If you're curious to learn a bit more about promises in Node.js, a quick web search will yield plenty of results. You can also reference one of our earlier recipes on making HTTP requests here.
While your user is waiting, you can serve them short audio assets (like sound effects) or other dynamic content to make the wait more delightful. Still, setting the expectation that a customer may have to wait a moment is one thing; it may not totally offset their frustration. There is also a strategy we can employ to speed up how content is fetched from APIs. And though it is a performance engineering practice, it's a quick, low-cost, and easy-to-deploy solution.
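One common approach along these lines is to cache API responses in module-level memory; because Lambda containers are reused across warm invocations, repeat queries can skip the slow network round trip entirely. A minimal sketch (cachedFetch and its five-minute TTL are hypothetical, not part of this recipe):

```javascript
// In-memory response cache, keyed by URL. Module-level state persists
// across warm Lambda invocations, so repeat queries hit the cache.
const cache = {};
const TTL_MS = 5 * 60 * 1000; // keep entries for five minutes (arbitrary)

// fetchFn is any function that takes a URL and returns a promise,
// such as the HTTP helper used earlier in this recipe.
function cachedFetch(url, fetchFn) {
  const hit = cache[url];
  if (hit && Date.now() - hit.time < TTL_MS) {
    return Promise.resolve(hit.value); // fast path: no network round trip
  }
  return fetchFn(url).then((value) => {
    cache[url] = { value, time: Date.now() };
    return value;
  });
}
```

Because cachedFetch still returns a promise, it can replace the raw HTTP call inside Promise.all without changing anything else.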
For more recipes, visit the Alexa Skill-Building Cookbook on GitHub.
You can make money through Alexa skills using in-skill purchasing or Amazon Pay for Alexa Skills. You can also make money for eligible skills that drive some of the highest customer engagement with Alexa Developer Rewards. Download our guide to learn which product best meets your needs.
Bring your big idea to life with Alexa and earn perks through our tiered rewards system. Publish a skill in May and receive an Alexa backpack. If 1,000 customers use your skill in its first 30 days in the Alexa Skills Store, you can also earn a free Echo Plus. Learn more about our promotion and start building today.