Welcome to module 1 of our introductory tutorial on building an engaging Alexa skill. In this module, we'll discuss why you should build an Alexa skill.
Time required: 10 - 20 minutes
What you’ll learn:
VUIs are natural, conversational, and user-centric.
A great voice experience allows for the many ways people express meaning and intent. It is rich and flexible. Because of this, building for voice isn’t the same as building graphical user interfaces (GUIs) for the web or mobile.
The easier a skill is to use, the more speed and efficiency it offers.
Alexa skills bring speed and efficiency to mundane or habitual tasks—which is why voice is poised to become ubiquitous.
Consider the kitchen timer. With Alexa, setting a timer is as easy as saying, “Alexa, set timer for 10 minutes.” Who would have guessed pushing a few buttons on the microwave would become the less convenient option?
With the Alexa Skills Kit (ASK), you can create an entirely custom voice experience or build a wide range of skills using our pre-built models. These pre-built interaction models include predefined requests and utterances to help you start building quickly.
Make money selling digital content in your skill. You can sell engaging content to customers as in-skill products in the form of subscriptions, one-time purchases, or consumables.
For example, let's say you build a knowledge-sharing skill that helps teach the user a process or task. You could start with free introductory content to earn the user's trust that the skill is valuable. Then, you could sell access to premium content that is more sophisticated and valuable.
Here are a few examples of how a user might interact with a custom skill:
With a custom skill, you can engage the user in a game, such as word puzzles or trivia, or just about any other action you can imagine!
As the skill builder, you:
Starting in Module 3, you will learn how to develop a custom skill using ASK.
Use the Smart Home Skill API to build a smart home skill with a pre-built model. This type of skill controls smart home devices such as cameras, lights, locks, thermostats, and smart TVs. The Smart Home Skill API gives you less control over a user's experience but simplifies development because you don't need to create the VUI yourself.
Invoking the skill is also very easy. A user can make requests such as the following:
"Alexa, turn on the living room lights"
"Alexa, increase the temperature by two degrees"
"Alexa, show the front door camera"
Use the Flash Briefing Skill API to provide your customers with news headlines and other short content. A user can make requests such as the following:
"Alexa, give me my flash briefing"
"Alexa, tell me the news"
As the skill developer, you define the content feeds for the requested flash briefing. These feeds can contain audio content played to the user or text content read to the user.
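A content feed of this kind is usually served as a JSON document. The following Python sketch builds one text item and one audio item. The field names (`uid`, `updateDate`, `titleText`, `mainText`, `streamUrl`, `redirectionUrl`) reflect the Flash Briefing feed format as we understand it, so check them against the current API reference. The `make_feed_item` helper, the titles, and the `example.com` URLs are hypothetical and exist only for illustration.

```python
import json
import uuid
from datetime import datetime, timezone


def make_feed_item(title, main_text, redirection_url, stream_url=None):
    """Build one feed item (hypothetical helper, not part of any SDK)."""
    item = {
        # A unique identifier for this item.
        "uid": "urn:uuid:" + str(uuid.uuid4()),
        # Publication time in UTC.
        "updateDate": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.0Z"),
        "titleText": title,
        # Text content that is read to the user (left empty for audio items).
        "mainText": main_text,
        "redirectionUrl": redirection_url,
    }
    if stream_url is not None:
        # Audio content that is played to the user.
        item["streamUrl"] = stream_url
    return item


# Assumed example data: one text item and one audio item.
feed = [
    make_feed_item(
        "Text headline",
        "This sentence would be read aloud to the user.",
        "https://example.com/stories/1",
    ),
    make_feed_item(
        "Audio headline",
        "",
        "https://example.com/stories/2",
        stream_url="https://example.com/audio/2.mp3",
    ),
]

feed_json = json.dumps(feed, indent=2)
```

You would then host `feed_json` at a URL and register that URL as the feed for your flash briefing skill.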
Use the Video Skill API to provide video content such as TV shows and movies for users. A user can make requests such as the following:
"Alexa, play Manchester by the Sea"
"Alexa, change the TV to channel 4"
As the skill developer, you define the requests the skill can handle, such as searching for and playing video content, and how video content search results display on Alexa-enabled devices.
Use the Music Skill API to provide audio content such as songs, playlists, or radio stations for users. A user can make requests such as the following:
"Alexa, play some music"
"Alexa, play jazz"
This API handles the words a user can say to request and control audio content. These spoken words turn into requests that are sent to your skill. Your skill handles these requests and responds appropriately, sending back audio content for the user on an Alexa-enabled device.
Note: Currently, music skills are supported only in the United States.
These are just a few examples of pre-built models that can help speed up your development.
The following is a simple workflow that demonstrates how Alexa works. In this example, the user invokes a simple Alexa skill called Hello World.
1. To launch the skill, the user says, "Alexa, open Hello World."
2. The Alexa-enabled device sends the utterance to the Alexa service in the cloud. There, automatic speech recognition converts the utterance to text, and natural language understanding identifies the intent behind that text.
3. Alexa sends a JavaScript Object Notation (JSON) request that describes the intent to an AWS Lambda function in the cloud. The Lambda function acts as the backend and executes code to handle the intent. In this case, the Lambda function returns, "Welcome to the Hello World skill."
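To make step 3 more concrete, here is a minimal sketch of what such a Lambda function might look like in plain Python, without the ASK SDK. It assumes that opening the skill arrives as a request of type `LaunchRequest` and that the reply uses the standard response envelope with `outputSpeech`. The `build_response` helper, the fallback message, and the trimmed `sample_event` are illustrative assumptions rather than the course's own code.

```python
def build_response(speech_text, end_session):
    """Wrap plain text in the JSON envelope that Alexa expects back."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point that AWS Lambda calls with the JSON request from Alexa."""
    request_type = event.get("request", {}).get("type")

    if request_type == "LaunchRequest":
        # The user said, "Alexa, open Hello World."
        return build_response("Welcome to the Hello World skill.", False)

    # Anything this sketch does not handle gets a generic reply (assumed wording).
    return build_response("Sorry, I can't help with that yet.", True)


# A heavily trimmed, hypothetical request; real requests carry more fields.
sample_event = {
    "version": "1.0",
    "request": {"type": "LaunchRequest", "locale": "en-US"},
}

sample_reply = lambda_handler(sample_event, None)
```

Alexa converts the `text` in the reply to speech and plays it on the user's device. A real skill would add branches for each intent defined in its interaction model.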
The animation below demonstrates what happens when a user interacts with an Alexa skill. It assumes you are using AWS Lambda, serverless cloud computing, to host your skill code.
Follow these steps to build your skill with the ASK.
Begin by designing the voice interaction model of your skill. Once you start designing, you will quickly see that designing for voice is different from designing mobile or web-based apps. You need to think about all the different ways a user might interact with your voice skill. To provide a fluid and natural voice experience, it is important to script and then act out the different ways a user might talk to Alexa. Also, if you have a multimodal experience (voice and visual), you need to think through the different workflows a user can follow to navigate your skill.
Once your interaction model is ready, build the utterances, intents, and slots in the Alexa developer console.
The interaction model is saved in JSON format, and you can edit the model with any text editor. After your JSON interaction model is ready, build the backend Lambda function in the AWS Management Console.
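As a rough illustration of what that JSON contains, the sketch below writes a small interaction model as a Python dictionary and serializes it. The overall layout (`interactionModel`, `languageModel`, `invocationName`, `intents`, `slots`, `samples`) follows the format the developer console exports, as far as we know. The intent names, sample utterances, and the `undeclared_slots` helper are hypothetical and are not taken from this course.

```python
import json
import re

interaction_model = {
    "interactionModel": {
        "languageModel": {
            # What the user says to open the skill.
            "invocationName": "hello world",
            "intents": [
                {
                    # Hypothetical custom intent without slots.
                    "name": "HelloWorldIntent",
                    "slots": [],
                    "samples": ["hello", "say hello", "say hello world"],
                },
                {
                    # Hypothetical custom intent with one slot.
                    "name": "GreetPersonIntent",
                    "slots": [{"name": "firstName", "type": "AMAZON.FirstName"}],
                    "samples": ["say hello to {firstName}", "greet {firstName}"],
                },
                # Built-in intents need no sample utterances of their own.
                {"name": "AMAZON.HelpIntent", "samples": []},
                {"name": "AMAZON.StopIntent", "samples": []},
            ],
            "types": [],
        }
    }
}


def undeclared_slots(model):
    """Return (intent, slot) pairs used in an utterance but not declared."""
    problems = []
    for intent in model["interactionModel"]["languageModel"]["intents"]:
        declared = {slot["name"] for slot in intent.get("slots", [])}
        for sample in intent.get("samples", []):
            for used in re.findall(r"\{(\w+)\}", sample):
                if used not in declared:
                    problems.append((intent["name"], used))
    return problems


model_json = json.dumps(interaction_model, indent=2)
```

A quick check such as `undeclared_slots` can catch a typo in a slot name before you paste the JSON into the console. The console runs its own validation when it builds the model.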
Select the programming language of your choice and the corresponding ASK software development kit (SDK), and begin coding your skill. The ASK SDK and Lambda jointly support Node.js, Python, and Java.
You can build and host most skills for free with AWS Lambda, which is free for the first one million calls per month. You can provision your own Lambda endpoint or use Alexa-hosted skills, which provision one for you without the need to create an AWS account. Once the backend Lambda function is ready, connect the Lambda function to your skill and test it in the Alexa developer console. Take modules 3, 4, and 5 of this course to learn how to build in the console and host your skill.
The Alexa developer console has a built-in Alexa simulator, which is similar to testing on an actual Alexa-enabled device.
After testing your skill with the Alexa simulator, we recommend gathering user feedback to resolve issues and make improvements before submitting your skill for certification.
Get ready to build by taking the following actions: