By Sam Morgan, Head of Education at Makers Academy
Editor’s note: This is part one of our Makers Academy series for Ruby developers. Learn more about this free training on the Alexa Skills Kit in this blog post.
Welcome to the first module of Makers Academy's short course on building Alexa skills using Ruby. Amazon's Alexa Skills Kit allows developers to extend existing applications with deep voice integration and to construct entirely new applications that leverage cutting-edge voice-controlled technology.
This course will cover all the terminology and techniques required to get fully functional skills pushed live to owners of Alexa-enabled devices around the world using Ruby and Sinatra.
What's in This Module?
This module contains a basic introduction to scaffolding a skill and interacting with Alexa. This module introduces:
- Intent schemas
- Alexa communication paradigm
- Tunneling a local application using ngrok over HTTPS
- Connecting Alexa to a local development environment
- Alexa-style JSON requests and responses
During this module, you will construct a simple skill called “Hello World.” While building this skill, you will come to understand how the above concepts work and play together. This module uses:
- Ruby's JSON library
Let's get started!
1. Amazon-side Setup: Setting up the Voice User Interface (VUI)
Our first step is to set up the skill on Amazon.
- Click “Add a new skill”
- Use a “default custom interaction model”
- Set up the skill:
  - Name (“Hello World”)
  - Invocation Name (“Hello World”)
The invocation name is used by the user to access a certain skill. For example, "Alexa, ask Hello World to say hello world."
Now that we have a new skill, let's construct the intent schema.
The intent schema lists all the possible requests Amazon can make to your application.
The minimal intent schema is a JSON object with a single property: intents. This property lists all the actions an Alexa skill can take. Each action is a JSON object with a single property: intent. The intent property gives the name of the intent.
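As a rough sketch, the minimal schema described above can be built and printed with Ruby's JSON library. The intent name HelloWorld is the one used in this module. Generating the JSON from Ruby is only for illustration, since the schema itself is entered in the developer portal as plain JSON.

```ruby
require 'json'

# Minimal intent schema: a single "intents" property listing each intent.
# "HelloWorld" is the only intent this module defines.
INTENT_SCHEMA = {
  intents: [
    { intent: 'HelloWorld' }
  ]
}.freeze

# Print the schema as formatted JSON.
puts JSON.pretty_generate(INTENT_SCHEMA)
```

Adding another action to the skill would mean adding another object with its own intent property to the intents list.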
Now that we have the intent schema, let's make the utterances. Utterances map intents to phrases spoken by the user. Each utterance is written on its own line in the following form: the intent name, then a space, then the phrase the user says.
In our case, we have only one intent, HelloWorld, and we'd like the user to say the following:
Alexa, ask Hello World to say hello world.
Our utterances are:
HelloWorld say hello world
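To make the mapping concrete, here is a small sketch that splits an utterance line into its intent name and spoken phrase. The helper parse_utterances is hypothetical and exists only for illustration; Amazon does this parsing for you once the utterances are saved in the portal.

```ruby
# Each sample utterance line pairs an intent name with a spoken phrase.
SAMPLE_UTTERANCES = <<~UTTERANCES
  HelloWorld say hello world
UTTERANCES

# Hypothetical helper: split each line at the first space, so the intent
# name ends up on the left and the spoken phrase on the right.
def parse_utterances(text)
  text.each_line.map(&:strip).reject(&:empty?).map do |line|
    intent, phrase = line.split(' ', 2)
    { intent: intent, phrase: phrase }
  end
end

parse_utterances(SAMPLE_UTTERANCES).each do |utterance|
  puts "#{utterance[:intent]} <- \"#{utterance[:phrase]}\""
end
```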
We've now set up our skill on Amazon's Alexa Developer Portal.
2. Setting up the Backend: A local Tunneled Development Environment
Our second step is to set up our local Ruby application to be ready to receive encrypted requests from Amazon’s servers (i.e. HTTP requests over SSL or “HTTPS” requests).
We will walk through setting up a Ruby server using Sinatra. The server will run locally and be able to receive HTTPS requests through a tunnel.
Alternatively, you could set up a remote development server using Heroku (with Heroku SSL), Amazon Elastic Beanstalk (with a self-signed SSL certificate), or any other method you can think of.
We’re going to use ngrok to tunnel to a local development server.
Setting Up a Sinatra Application
- Make the directory with mkdir hello_world_app
- Head into the directory with cd hello_world_app
- Set up a Ruby application with bundle init (you may need to install Bundler with gem install bundler first)
- Add the Sinatra gem to your Gemfile by adding the line gem 'sinatra'
- Install Sinatra to your project using bundle install
- Create a server file with touch server.rb
- For now, create a single POST route, '/', that prints out the request body we are going to receive from Amazon:
# inside server.rb
require 'sinatra'
post '/' do
  puts request.body.read
end
Opening Your Sinatra Application to the Internet Using Ngrok
- Download the appropriate ngrok package for your operating system from the ngrok downloads page
- Unzip the package and transfer the executable to your hello_world_app directory
- Start ngrok using ./ngrok http 4567
- Copy the URL starting with “https” and ending with “.ngrok.io” from your ngrok terminal to the clipboard (command-C on a Mac)
- In a second terminal, start your Sinatra application using ruby server.rb.
3. Linking the Alexa VUI to Our Backend via the Endpoint
Our third step is to link the skill we set up on Amazon (1) with the tunnel endpoint (2) so our skill can send requests to our local application.
Configuring the Endpoint in the Alexa Skills Portal
When Amazon invokes an intent, Amazon sends a POST request to the specified endpoint (web address).
Head back to your Alexa skill (for which you just entered intents and utterances). Hit “Next,” then set up the endpoint.
- Use HTTPS, not AWS Lambda (at the time of writing, there was no Ruby support on Lambda)
- Geographical region: Europe (we picked Europe because Makers Academy is in the UK, but you would select North America if you’re in the United States)
- Paste the endpoint to your application into the text input field
If using ngrok, your endpoint is the URL you copied, starting with "https" and ending with ".ngrok.io".
- You won't need account linking for this skill.
Amazon Alexa only sends requests to secure endpoints: ones secured using an SSL certificate (denoted by the 'S' in HTTPS). Since we used ngrok to set up our HTTPS endpoint, we can use ngrok's wildcard certificate instead of providing our own.
- If you used ngrok to set up a tunnel, select “My development endpoint is a sub-domain of a domain that has a wildcard certificate from a certificate authority.”
- Hit “Next” again.
Testing in the Service Simulator
The Service Simulator in the Amazon Alexa Developer Portal allows you to try out utterances. Once you’ve written an utterance into the Service Simulator, you can send test requests to the application endpoint you defined. You can see your application’s response to each request that you send.
- Use the Service Simulator to test that the say hello world utterance causes Amazon to send an intent request to your local application, and observe that the request body printed to the command line matches the JSON request sent in the Service Simulator.
- You will receive an error in the Service Simulator as you’re not sending a response to this request just yet. This is not a problem for now; check your logs to view the request that was sent.
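The request body printed in your terminal is JSON, so Ruby's JSON library can pick it apart. The sketch below pulls the intent name out of a request. SAMPLE_REQUEST is a heavily abridged, hand-written stand-in rather than a captured request (real requests carry more fields, such as session details), and intent_name is a hypothetical helper.

```ruby
require 'json'

# Abridged, hand-written stand-in for an Alexa request body (assumption:
# real requests contain more fields, e.g. session data and a timestamp).
SAMPLE_REQUEST = <<~JSON_BODY
  {
    "version": "1.0",
    "request": {
      "type": "IntentRequest",
      "intent": { "name": "HelloWorld" }
    }
  }
JSON_BODY

# Hypothetical helper: returns the intent name for an IntentRequest,
# or nil for any other request type.
def intent_name(raw_body)
  request = JSON.parse(raw_body).fetch('request', {})
  return nil unless request['type'] == 'IntentRequest'

  request.dig('intent', 'name')
end

puts intent_name(SAMPLE_REQUEST)
```

Inside the Sinatra route, the same helper could be called with request.body.read in place of SAMPLE_REQUEST.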
You’ve now hooked up your local development environment to an Alexa skill.
4. Responding to Alexa Requests
Now we have set up an Alexa skill (1), built a local development server with an endpoint tunneled via HTTPS (2), and can make requests from Amazon to our local development server through that endpoint (3).
Our final step is to construct a response from our endpoint such that Amazon can interpret the response to make Alexa say, “Hello, world” to us.
Building the JSON Response
Amazon sends JSON requests and expects JSON responses in a particular format. Let's set that up here.
# inside server.rb
require 'json'
post '/' do
  { version: "1.0",
    response: { outputSpeech: { type: "PlainText", text: "Hello World" } } }.to_json
end
There are a few parts to this JSON response object:
- version (string): required. Allows you to version your responses.
- response (object): required. Tells Alexa how to respond, including speech, cards, and prompts for more information.
  - outputSpeech (object): tells Alexa what to say.
    - type (string): required. Tells Alexa whether to use plain-text speech (PlainText), where Alexa will guess pronunciation, or Speech Synthesis Markup Language (SSML), where you can specify pronunciation very tightly.
      - EXTRA CREDIT: Change the response to use custom pronunciation using SSML.
    - text (string): required when type is PlainText. Tells Alexa exactly what to respond with.
      - EXTRA CREDIT: Play around with this response, restarting the server and sending an Intent Request from the Service Simulator each time.
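If you would like a starting point for experimenting, here is one possible sketch of the fields listed above as small Ruby helpers. The helper names (alexa_response, plain_text_response, ssml_response) are our own inventions for illustration, not part of any Alexa library. The SSML variant assumes the markup is sent in an ssml field wrapped in speak tags, which is how Alexa's response format represents SSML speech.

```ruby
require 'json'

# Hypothetical helper: wraps a speech object in the response envelope
# described above (version plus response.outputSpeech) and returns JSON.
def alexa_response(output_speech)
  { version: '1.0', response: { outputSpeech: output_speech } }.to_json
end

# Plain-text speech: Alexa decides how to pronounce the text.
def plain_text_response(text)
  alexa_response({ type: 'PlainText', text: text })
end

# SSML speech: the markup goes in an "ssml" field inside <speak> tags.
def ssml_response(markup)
  alexa_response({ type: 'SSML', ssml: "<speak>#{markup}</speak>" })
end

puts plain_text_response('Hello World')
puts ssml_response('Hello <break time="1s"/> World')
```

With these defined in server.rb, the POST route could simply return plain_text_response('Hello World').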
Testing Our Response in the Service Simulator...and Beyond!
Now that we've built a JSON response, we can restart the server and test out the new response in the Service Simulator.
If you would like to try your new Hello World skill out live, ask Hello World to say, “Hello, World” on any Alexa-enabled device registered to your developer account. You can also try it out on the browser-based Alexa skill testing tool Echosim.io.
Ready for the next module? Learn how to build a fact-checking skill with slots and custom slot types.