Tutorial: Build an Engaging Skill

Module 1: Why Build Alexa Skills

Welcome to module 1 of our introductory tutorial on building an engaging Alexa skill. In this module, we'll discuss why you should build an Alexa skill.

Time required: 10 - 20 minutes

What you’ll learn:

  • Why you should build Alexa skills
  • What type of Alexa skills you can create
  • How an Alexa skill works
  • The steps to build a skill
  • The requirements to build a skill

Why Build Alexa Skills?

Ease of access

Voice user interfaces (VUIs) are natural, conversational, and user-centric.

A great voice experience allows for the many ways people express meaning and intent. It is rich and flexible. Because of this, building for voice isn’t the same as building graphical user interfaces (GUIs) for the web or mobile.

The easier a skill is to use, the more speed and efficiency it offers.

Speed and efficiency

Alexa skills bring speed and efficiency to mundane or habitual tasks—which is why voice is poised to become ubiquitous.

Consider the kitchen timer. With Alexa, setting a timer is as easy as saying, “Alexa, set timer for 10 minutes.” Who would have guessed pushing a few buttons on the microwave would become the less convenient option?

Skill monetization

Make money selling digital content in your skill. You can sell engaging content to customers as in-skill products through a subscription, one-time purchase, or consumables.

For example, let's say you build a knowledge-sharing skill that helps teach the user a process or task. You could start with free introductory content to earn the user's trust that the skill is valuable. Then, you could sell access to premium content that is more sophisticated and valuable.

What Type of Skill do You Want to Create?

With the Alexa Skills Kit (ASK), you can imagine an entirely custom voice experience or build a wide range of skills using our pre-built models. The Alexa Skills Kit offers pre-built interaction models which include predefined requests and utterances to help you start building quickly.

Want to Build a Custom Skill?

For custom skills, you define the interaction model. Therefore, you have flexibility and control over the skill design and code.

Here are a few examples of how a user might interact with a custom skill:

  • “Alexa, order a pizza”
  • “Alexa, book a taxi”

With a custom skill, you can engage the user in a game, such as word puzzles or trivia, or just about any other action you can imagine!

As the skill builder, you:

  • Define the requests the skill can handle
  • Define the name Alexa uses to identify your skill, called the invocation name, which you will learn more about in the next module
  • Write the code to fulfill the request

Starting in Module 3, you will learn how to develop a custom skill using ASK.
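
To make the three responsibilities above more concrete, here is a minimal sketch in plain Python (it does not use the ASK SDK). The intent names OrderPizzaIntent and BookTaxiIntent, the handler functions, and the spoken replies are hypothetical and chosen only to match the example requests in this section.

```python
# Hypothetical intent names; a real skill defines its own in the interaction model.
def handle_order_pizza(request):
    # Code that fulfills the "order a pizza" request would go here.
    return "Would you like the usual?"


def handle_book_taxi(request):
    # Code that fulfills the "book a taxi" request would go here.
    return "Where would you like to go?"


# The requests this sketch can handle, mapped to the code that fulfills them.
INTENT_HANDLERS = {
    "OrderPizzaIntent": handle_order_pizza,
    "BookTaxiIntent": handle_book_taxi,
}


def fulfill(request):
    """Return the speech text for the intent named in the request."""
    intent_name = request.get("intent", {}).get("name")
    handler = INTENT_HANDLERS.get(intent_name)
    if handler is None:
        # Fallback wording is an assumption made for this sketch.
        return "Sorry, I can't help with that yet."
    return handler(request)


reply = fulfill({"type": "IntentRequest", "intent": {"name": "OrderPizzaIntent"}})
print(reply)
```

Keeping the handlers in a lookup table means that adding a new request to the skill is a matter of writing one function and registering it under its intent name.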

Want to Use a Pre-Built Model? See Examples Below

Smart Home Skills

Use the Smart Home Skill API to build a smart home skill with a pre-built model. This type of skill controls smart home devices such as cameras, lights, locks, thermostats, and smart TVs. The Smart Home Skill API gives you less control over a user's experience but simplifies development because you don't need to create the VUI yourself.
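
Because the voice model is pre-built, a smart home skill receives a structured message (a directive) rather than intents you defined yourself. The following is a rough sketch in plain Python of handling a power directive. The field names reflect our understanding of the Smart Home Skill API message format and should be checked against the API reference. The response is deliberately simplified (for example, it omits timestamps on the reported property), and handle_directive is a hypothetical name.

```python
import uuid

# Directive names this sketch understands, mapped to the resulting power state.
POWER_STATES = {"TurnOn": "ON", "TurnOff": "OFF"}


def handle_directive(directive_event):
    """Build a simplified response for a TurnOn or TurnOff directive."""
    directive = directive_event["directive"]
    header = directive["header"]
    if header["namespace"] != "Alexa.PowerController":
        raise ValueError("unsupported namespace: " + header["namespace"])
    if header["name"] not in POWER_STATES:
        raise ValueError("unsupported directive: " + header["name"])
    power_state = POWER_STATES[header["name"]]

    # A real skill would call the device maker's cloud here to switch the device.

    return {
        "event": {
            "header": {
                "namespace": "Alexa",
                "name": "Response",
                "payloadVersion": "3",
                "messageId": str(uuid.uuid4()),
                "correlationToken": header.get("correlationToken"),
            },
            "endpoint": {"endpointId": directive["endpoint"]["endpointId"]},
            "payload": {},
        },
        # Simplified: a complete response carries more detail per property.
        "context": {
            "properties": [
                {
                    "namespace": "Alexa.PowerController",
                    "name": "powerState",
                    "value": power_state,
                }
            ]
        },
    }
```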

Invoking the skill is also very easy. A user can make requests such as the following:

"Alexa, turn on the living room lights"

"Alexa, increase the temperature by two degrees"

"Alexa, show the front door camera"

Flash Briefing Skills

Use the Flash Briefing Skill API to provide your customers with news headlines and other short content. A user can make requests such as the following:

"Alexa, give me my flash briefing"

"Alexa, tell me the news"

As the skill developer, you define the content feeds for the requested flash briefing. These feeds can contain audio content played to the user or text content read to the user.
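
As an illustration of what a content feed might contain, the sketch below builds one text item and one audio item and serializes them to JSON. The field names (uid, updateDate, titleText, mainText, streamUrl, redirectionUrl) and the date format follow our recollection of the Flash Briefing feed format and should be verified against the feed reference. The helper make_feed_item and the example.com URLs are hypothetical.

```python
import json
from datetime import datetime, timezone


def make_feed_item(uid, title, updated, redirection_url, text="", stream_url=None):
    """Build one feed item; `updated` must be a timezone-aware datetime."""
    item = {
        "uid": uid,
        # Convert to UTC before formatting so the trailing "Z" is accurate.
        "updateDate": updated.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.0Z"),
        "titleText": title,
        "mainText": text,  # text content read to the user
        "redirectionUrl": redirection_url,
    }
    if stream_url is not None:
        item["streamUrl"] = stream_url  # audio content played to the user
    return item


published = datetime(2024, 1, 15, 8, 30, 0, tzinfo=timezone.utc)

text_item = make_feed_item(
    "item-001", "Morning headlines", published,
    "https://example.com/headlines",
    text="Here is today's top story.",
)
audio_item = make_feed_item(
    "item-002", "Morning headlines (audio)", published,
    "https://example.com/headlines",
    stream_url="https://example.com/audio/headlines.mp3",
)

feed_json = json.dumps([text_item, audio_item], indent=2)
print(feed_json)
```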

Video Skills

Use the Video Skill API to provide video content such as TV shows and movies for users. A user can make requests such as the following:

"Alexa, play Manchester by the Sea"

"Alexa, change the TV to channel 4"

As the skill developer, you define the requests the skill can handle, such as searching for and playing video content, and how video content search results display on Alexa-enabled devices.

Music Skills

Use the Music Skill API to provide audio content such as songs, playlists, or radio stations for users. A user can make requests such as the following:

"Alexa, play some music"

"Alexa, play jazz"

This API handles the words a user can say to request and control audio content. These spoken words turn into requests that are sent to your skill. Your skill handles these requests and responds appropriately, sending back audio content for the user on an Alexa-enabled device.

Note: Currently, music skills are supported only in the United States.

These are just a few examples of the pre-built models that could help speed up your development.

How an Alexa Skill Works

The following is a simple workflow that demonstrates how Alexa works. In this example, the user invokes a simple Alexa skill called Hello World.

1. To launch the skill, the user says, "Alexa, open Hello World."

2. The Alexa-enabled device sends the utterance to the Alexa service in the cloud. There, automatic speech recognition converts the utterance to text, and natural language understanding recognizes the intent of the text.

3. Alexa sends a JavaScript Object Notation (JSON) request describing the intent to an AWS Lambda function in the cloud. The Lambda function acts as the backend and executes code to handle the intent. In this case, the Lambda function returns the response, "Welcome to the Hello World skill."

The following sequence shows what happens when a user interacts with an Alexa skill. It assumes you are using AWS Lambda, a serverless cloud computing service, to host your skill code:

  • The user says the wake word, Alexa.
  • Alexa hears the wake word and listens.
  • The Alexa service uses the interaction model to figure out where to route the request.
  • A JSON request is sent to the skill's Lambda function.
  • The Lambda function inspects the JSON request.
  • The Lambda function determines how to respond.
  • The Lambda function sends a JSON response to the Alexa service.
  • The Alexa service receives the JSON response and converts the output text to an audio file.
  • The Alexa-enabled device receives and plays the audio.
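
As a minimal sketch of the request and response steps above, the following Python function plays the role of the Lambda function in the Hello World example. It works on plain dictionaries rather than the ASK SDK, shows only the fields needed for the illustration, and its fallback message is an assumption.

```python
def lambda_handler(event, context):
    """Inspect the JSON request and return a JSON response."""
    # Inspect the request: which kind of request did the Alexa service send?
    request_type = event["request"]["type"]

    # Determine how to respond.
    if request_type == "LaunchRequest":
        speech = "Welcome to the Hello World skill."
    else:
        # Assumed fallback wording for anything this sketch does not handle.
        speech = "Sorry, I didn't understand that."

    # Send back a response; the Alexa service converts the text to audio.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }


# Simulate the request sent when the user says, "Alexa, open Hello World."
launch_event = {"version": "1.0", "request": {"type": "LaunchRequest"}}
result = lambda_handler(launch_event, None)
print(result["response"]["outputSpeech"]["text"])
```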

What Are The Steps to Build a Skill?

Follow these steps to build your skill with the ASK.

Step 1: Design the Voice User Interface

Begin by designing the voice interaction model of your skill. Once you start designing, you will quickly see that designing for voice is different from designing mobile or web-based apps. You need to think about all the different ways a user might interact with your skill. To provide a fluid and natural voice experience, it is important to script and then act out the different ways a user might talk to Alexa. Also, if you have a multimodal experience (voice and visual), you need to think through the different workflows for navigating your skill.

Step 2: Build

Once your design is ready, build the interaction model's utterances, intents, and slots in the Alexa developer console.

The interaction model is saved in JSON format, and you can edit the model with any text editor. After your JSON interaction model is ready, build the backend Lambda function in the AWS Management Console.
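
To give a sense of what that JSON looks like, the sketch below assembles a small interaction model for the Hello World example as a Python dictionary and serializes it. The intent name HelloWorldIntent and its sample utterances are hypothetical. The overall layout (interactionModel, languageModel, invocationName, intents, samples) reflects our understanding of the schema the developer console uses.

```python
import json

interaction_model = {
    "interactionModel": {
        "languageModel": {
            # The name users say to open the skill.
            "invocationName": "hello world",
            "intents": [
                {
                    # Hypothetical custom intent with its sample utterances.
                    "name": "HelloWorldIntent",
                    "slots": [],  # this intent collects no variable values
                    "samples": ["hello", "say hello", "say hello world"],
                },
                # Built-in intents come with their own utterances.
                {"name": "AMAZON.HelpIntent", "samples": []},
                {"name": "AMAZON.StopIntent", "samples": []},
                {"name": "AMAZON.CancelIntent", "samples": []},
            ],
        }
    }
}

model_json = json.dumps(interaction_model, indent=2)
print(model_json)
```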

Select the programming language of your choice and the corresponding ASK software development kit (SDK), and begin coding your skill. The ASK SDK and Lambda jointly support Node.js, Python, and Java.

You can build and host most skills for free with AWS Lambda, which is free for the first one million calls per month. You can provision your own Lambda endpoint or use Alexa-hosted skills, which provisions one for you without the need to create an AWS account. Once the backend Lambda function is ready, integrate it with your skill and test it in the Alexa developer console. Modules 3, 4, and 5 of this course show how to build in the console and host your skill.

Step 3: Test

The Alexa developer console has a built-in Alexa simulator, which lets you test your skill much as you would on an actual Alexa-enabled device.

After testing your skill with the Alexa simulator, we recommend gathering user feedback to resolve issues and make improvements before submitting your skill for certification.

Step 4: Certification and launch

After beta testing your skill, submit it for certification. Once your skill passes certification, it will be published in the Alexa Skills Store for anyone to discover and use. Start promoting it to reach more customers. 

Summary

These are the fundamental steps for building Alexa skills.

You will dive deeper into each step in subsequent modules of this tutorial.

Requirements to build a skill for this tutorial

To build a skill for this tutorial, you need the following:

  • An account on the Alexa developer console. The console is where you will build and optimize your skill.

  • An internet-accessible endpoint for hosting your backend cloud-based service. Your backend skill code is usually a Lambda function. For this course, you will create an Alexa-hosted skill: the developer console provisions a Lambda endpoint for you and lets you use the Alexa Skills Kit (ASK) SDK directly in the console. Keep in mind that the ASK SDK supports Node.js, Python, and Java, but Alexa-hosted skills are available only in Node.js and Python.

  • A development environment appropriate for the programming language you plan to use. Lambda natively supports Java, Go, PowerShell, Node.js, C#, Python, and Ruby, and provides a runtime API that allows you to author your functions in additional programming languages.

  • A publicly accessible website to host any images, audio files, or video files used in your skill. If you host your skill backend with the Alexa-hosted option, an Amazon Simple Storage Service (Amazon S3) bucket is provisioned for you. If you use another hosting option, such as your own AWS Lambda function, you may use Amazon S3 to host files used in your skill. If you do not have files other than a skill icon, you do not need to host any resources.

  • (Optional) Alexa-enabled device for testing. Skills work with all Alexa-enabled devices, such as the Amazon Echo, Echo Dot, Fire TV Cube, and devices that use the Alexa Voice Service (AVS). If you don't have a device, you can use the Alexa simulator in the developer console. Through the simulator, you can see the display templates for Echo Show and Echo Spot, although the display is not interactive. If your skill includes display and touch interactions, you need an Alexa-enabled device with a screen to test the skill.

Great job!

In the next module, you will learn about the design process and key concepts related to building an interaction model for a custom skill.