The Alexa Skills Kit is a collection of self-service APIs, tools, documentation and code samples that make it fast and easy for developers to add skills to Alexa. Justin Kovac, developer of 7-Minute Workout and Technical Program Manager for Alexa Skills Kit, shares his experience and tips for diving head-first into building your own skills.
Prior to his current role, Justin was a Developer Advocate for multiple services across Amazon, where his core responsibility was to serve as a voice of the developer community. This included gathering community feedback to help guide initiatives and providing technical guidance to anyone seeking help via Amazon's Developer Forums and Contact Us support channels. "When I began supporting Alexa, I needed to get my bearings quickly," Justin remembers. “How can you advocate on behalf of a new developer community if you haven’t been in their shoes?”
To get started, Justin attended a hackathon – the perfect opportunity to learn the whole process, from concept to certification.
"The 7-Minute Workout skill is extremely simple in concept," Justin believes. "After some brainstorming, I remembered an iOS app I used based on a New York Times article. It worked, but it felt awkward to have my phone on the table or floor while looking for the next exercise in the routine." That's when Justin began creating a proof of concept of his skill using Node.js and AWS Lambda, an Amazon Web Service where you can run code for virtually any type of application or backend service with zero administration.
“To me, the most important benefit of 7-Minute Workout was getting hands-on knowledge of how to develop an Alexa skill, end to end. Knowing that, I was able to better support the developers who are just joining our community.”
Below, Justin discusses the top seven lessons he learned while developing the 7-Minute Workout.
One of the things that the experience at the hackathon made very clear to me was the need to start with the voice experience, not the code. While skills are developed using the same tools and resources as you would use when creating an app, designing for voice feels distinctly different, which makes it essential to understand voice user interface (VUI) concepts first. An action that you would traditionally trigger by pressing a button in an app is now a variable with hundreds of potential values based on the customer’s request. So a customer could potentially say, “start a new workout” or “begin a workout” or “let’s exercise.” This guide is a great starting point to help you better understand Alexa Skills Kit, VUI, and how to keep users on the "happy path" when interacting with your skill via voice.
With no prior experience building an Alexa skill, I needed the ability to dive right in. What I quickly realized was that there was no need to reinvent the wheel. Amazon’s included samples provide a great variety of functional building blocks to kick-start your skill, including DynamoDB integration, multi-stage conversations, RESTful requests to third-party APIs and more. Personally, I used 'Wiseguy' as a starting point for the 7-Minute Workout skill because of its simplicity and intent structure. For each sample, read the overview of features and don't forget to follow the README.md files for step-by-step instructions.
When I looked at user reviews for my skill, I saw a trend in users asking for the ability to track how many workouts they completed and the ability to pick up where they left off. By default, a 'session' in an Alexa skill only lasts as long as a conversation between Alexa and the user continues. Within a session, you can store custom values, such as a user's response to a question, and read them back in later intents by using the session.attributes property of the session object that is passed to each intent handler.
For example:
    function handleWelcomeIntent(intent, session, response) {
        var speechText = "Welcome to 7 minute workout. What difficulty would you like your workout? Easy, or hard?";
        ...
    }

    function handleDifficultyIntent(intent, session, response) {
        var speechText = "";
        var difficultySlot = intent.slots.difficultyWords.value.toLowerCase();
        if (difficultySlot == "easy" || difficultySlot == "hard") {
            // remember the answer for later intents in this session
            session.attributes.difficulty = difficultySlot;
            speechText = "Great! When you are ready to begin, say ready. ";
        } else {
            speechText = "Sorry, I didn't get that. Would you like easy, or hard?";
            ...
        }
        ...
    }

    function handleReadyIntent(intent, session, response) {
        // make sure the value is set
        if (session.attributes.difficulty) {
            if (session.attributes.difficulty == "easy") {
                // easy workout
            } else if (session.attributes.difficulty == "hard") {
                // hard workout
            }
        } else {
            ...
        }
    }
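For readers who want to see what travels over the wire, here is a minimal sketch of how session attributes round-trip between turns. The field names follow the Alexa custom skill response JSON; the helper name buildResponse and the simulated second turn are assumptions made for illustration, not code from the 7-Minute Workout skill.

```javascript
// Hypothetical helper: builds a response that carries attributes forward.
// Setting shouldEndSession to false keeps the conversation (and session) open.
function buildResponse(speechText, sessionAttributes, shouldEndSession) {
    return {
        version: "1.0",
        sessionAttributes: sessionAttributes,
        response: {
            outputSpeech: { type: "PlainText", text: speechText },
            shouldEndSession: shouldEndSession
        }
    };
}

// First turn: the user picked a difficulty, so store it and keep listening.
var firstTurn = buildResponse(
    "Great! When you are ready to begin, say ready.",
    { difficulty: "easy" },
    false
);

// Second turn (simulated): the stored attributes come back on the next
// request as session.attributes.
var nextSession = { attributes: firstTurn.sessionAttributes };
console.log(nextSession.attributes.difficulty);
```

Once a response is sent with shouldEndSession set to true, nothing is carried forward to the next conversation.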
Keep in mind, though, that a session can close for a variety of reasons: users saying 'stop,' users not responding, or the skill returning a response with shouldEndSession = true. To persist data across sessions, you should use a database service such as Amazon's DynamoDB to store the information you need to access later, such as the total number of completed workouts for the month. This is how I was able to give users the ability to pick up where they left off and track workouts.
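As a rough sketch of what that persistence could look like, the code below builds an update request that adds one to a per-user workout counter. The table name, the attribute names, and both function names are hypothetical, and this is not the skill's actual code. It assumes the docClient argument is a DynamoDB DocumentClient from the AWS SDK for JavaScript; the client is passed in, so the block itself runs without the SDK.

```javascript
// Hypothetical table and attribute names; adjust to your own schema.
var TABLE_NAME = "SevenMinuteWorkoutProgress";

// Build the parameters for a DocumentClient update call.
// ADD creates the counter on first use and increments it afterwards.
function buildCompletedWorkoutUpdate(userId) {
    return {
        TableName: TABLE_NAME,
        Key: { userId: userId },
        UpdateExpression: "ADD completedWorkouts :one",
        ExpressionAttributeValues: { ":one": 1 }
    };
}

// Record one finished workout for the given user.
// docClient is assumed to expose update(params, callback).
function recordCompletedWorkout(docClient, userId, callback) {
    docClient.update(buildCompletedWorkoutUpdate(userId), callback);
}
```

In a handler, the user key would typically come from session.user.userId, which stays the same for a user across sessions while the skill remains enabled. The position in the current routine could be stored in the same item to support picking up where the user left off.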
Cards are a great way to educate and provide detailed information about your skill to your users. The markup text that generates Alexa speech, known as Speech Synthesis Markup Language (SSML), can potentially show up on your card. In the case of the 7-Minute Workout skill, I found the SSML content was wrapped in a variety of tags including pauses, breaks, and phonemes. I used the simple regular expression below to remove the markup text from my card. For example:
    var regex = /(<([^>]+)>)/ig;
    var speechText = "Get ready. <break time=\"0.2s\" /> Three.<break time=\"0.5s\" /> Two.<break time=\"0.5s\" /> One.<break time=\"0.5s\" />";
    // replace tags with empty characters
    var cardText = speechText.replace(regex, "");
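Removing tags this way can leave doubled spaces where a tag sat between two words. A small variation, offered as a sketch rather than the skill's actual code, also collapses the leftover whitespace. The function name ssmlToCardText and the variable name countdown are made up for this example.

```javascript
// Convert SSML speech text into plain text suitable for a card.
function ssmlToCardText(speechText) {
    return speechText
        .replace(/<[^>]+>/g, "")  // drop every SSML tag
        .replace(/\s+/g, " ")     // collapse whitespace runs left behind
        .trim();                  // remove leading and trailing spaces
}

var countdown = "Get ready. <break time=\"0.2s\" /> Three.<break time=\"0.5s\" /> Two.<break time=\"0.5s\" /> One.<break time=\"0.5s\" />";
console.log(ssmlToCardText(countdown)); // prints "Get ready. Three. Two. One."
```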
When I was thinking through the user experience of my skill, I wanted to make sure other words wouldn’t be confused with the words needed to launch my skill. For example, I wanted to make sure to account for situations where Alexa may have heard a user say something like “Becky” instead of “ready”. I used custom slots, or parameters, to ensure the words would trigger the correct Alexa interaction. I found it easier to convert all incoming values to lowercase in order to make sure the text coming through matched the text in my code. For example:
    var workoutWords = ["ready", "go", "begin", "start", "next"];
    var startWorkoutSlot = intent.slots.StartWorkoutWords.value.toLowerCase();
    // check if spoken slot word is in the workoutWords array
    if (workoutWords.indexOf(startWorkoutSlot) > -1) {
        // success
    }
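One caveat with the check above: a slot may arrive without a value, in which case calling toLowerCase() on it throws. The sketch below guards against that. The helper names getSlotValue and isStartWord are hypothetical; the slot name and word list are taken from the example above.

```javascript
var workoutWords = ["ready", "go", "begin", "start", "next"];

// Return the slot's spoken value in lowercase, or null when the slot
// is missing or has no string value.
function getSlotValue(intent, slotName) {
    var slot = intent && intent.slots && intent.slots[slotName];
    if (!slot || typeof slot.value !== "string") {
        return null;
    }
    return slot.value.trim().toLowerCase();
}

// True only when the user said one of the expected start words.
function isStartWord(intent) {
    var word = getSlotValue(intent, "StartWorkoutWords");
    return word !== null && workoutWords.indexOf(word) > -1;
}

// Simulated intents: one expected word and one misheard word.
var heardReady = { slots: { StartWorkoutWords: { name: "StartWorkoutWords", value: "Ready" } } };
var heardBecky = { slots: { StartWorkoutWords: { name: "StartWorkoutWords", value: "Becky" } } };
console.log(isStartWord(heardReady)); // prints true
console.log(isStartWord(heardBecky)); // prints false
```

When isStartWord returns false, the skill can reprompt instead of failing, which keeps the user on the happy path.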
When I submitted my skill for certification the first time, it failed because I didn’t have example phrases in the sample utterances. My advice would be: don’t get discouraged. The certification process puts skills through a series of functional, security, and policy checks to ensure users have good experiences with skills. The Certification Submission Checklist is a great way to help you catch any common 'gotchas,' such as the requirements for help, stop, and cancel intents.
I found users were interacting with my skill in a variety of ways. They said phrases like “let’s exercise” or “begin working out,” both of which are intended to start a workout. To provide a good user experience, it was essential that Alexa understand the intent of the user. Because of this, I had family, friends, and colleagues test my skill to better understand and account for the countless ways users interact with the skill. In my experience, testing outside the box helped me gain a deeper understanding of VUI.
Justin’s 7-Minute Workout skill helps thousands of regular users with their workout regimens. If you have an Alexa-enabled device, you can try it out. Just enable the “7-Minute Workout” skill in the Alexa app, say “Alexa start 7-Minute Workout” and get fit.
To jump in and get started with the Alexa Skills Kit, check out the following resources:
Share other innovative ways you’re using Alexa in your life. Tweet us @alexadevs with hashtag #AlexaDevStory.