Understand Custom Skills


This document provides a high-level overview of building a custom skill.

Is a custom skill the right type of skill for what you want to build? See About Voice Interaction Models to learn about other options like the Smart Home Skill API. Also, be sure the skill you plan to build complies with all of the Alexa Skills Kit content policies.

Components of a Custom Skill

When designing and building a custom skill, you create the following:

  • A set of intents that represent the actions users can perform with your skill. These intents form the core functionality of your skill.
  • A set of sample utterances that specify the words and phrases users can say to invoke those intents. You map these utterances to your intents. This mapping forms the interaction model for the skill.
  • An invocation name that identifies the skill. The user includes this name when initiating a conversation with your skill.
  • If applicable, a set of images, audio files, and video files that you want to include in the skill. These must be stored on a publicly accessible site so that each item can be referenced by a unique URL.
  • A cloud-based service that accepts these intents as structured requests and then acts upon them. This service must be accessible over the Internet. You provide an endpoint for your service when configuring the skill.
  • A configuration that brings all of the above together so that Alexa can route requests to the service for your skill. You create this configuration in the developer console.

For example, a skill for getting tide information might define an intent called OneshotTideIntent to represent the user's request to look up tide information for a particular coastal city.

This intent would be mapped to several sample utterances such as:

OneshotTideIntent get high tide
OneshotTideIntent get high tide for {City}
OneshotTideIntent tide information for {City}
OneshotTideIntent when is high tide in {City}
...
(many more sample utterances)

A user would say something like:

User: Alexa, get high tide for Seattle from Tide Pooler
(In this example, "get high tide for Seattle" matches the sample utterance get high tide for {City} that you have defined, while "Tide Pooler" is the invocation name.)

Speaking this to an Alexa-enabled device does the following:

  1. The user's speech is streamed to the Alexa service in the cloud.
  2. Alexa recognizes that this request represents the OneshotTideIntent intent for the "Tide Pooler" skill.
  3. Alexa structures this information into a request (specifically an IntentRequest in this example) and sends this request to the service defined for Tide Pooler. The request includes the value "Seattle" for the City slot.
  4. The Tide Pooler service gets the request and takes an appropriate action, such as looking up tide information for the current date in Seattle from https://tidesandcurrents.noaa.gov/ (a code sketch of such a handler follows the example response below).
  5. Tide Pooler sends the Alexa service a structured response with the text to speak to the user.
  6. The Alexa-enabled device speaks the response back to the user:

    Tide Pooler: Today in Seattle, the first high tide will be around 1:42 in the morning, and will peak at about 10 feet…
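To make steps 3 through 5 concrete, the following is a minimal sketch of a request handler written with the ASK SDK for Python (the ask-sdk-core package). It reads the City slot from the incoming IntentRequest and returns the text for Alexa to speak. The class name, the wording, and the stubbed-out tide lookup are illustrative assumptions, not part of the actual Tide Pooler sample.

```python
# Illustrative sketch: handling the OneshotTideIntent request with the ASK SDK for Python.
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.utils import is_intent_name
from ask_sdk_model import Response


class OneshotTideIntentHandler(AbstractRequestHandler):
    """Handles IntentRequests whose intent name is OneshotTideIntent."""

    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("OneshotTideIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        # Read the City slot value that Alexa captured from the utterance ("Seattle").
        city = handler_input.request_envelope.request.intent.slots["City"].value

        # A real skill would look up tide data for that city here.
        speech = "Today in {}, the first high tide will be around 1:42 in the morning.".format(city)
        return handler_input.response_builder.speak(speech).response
```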

Figure: User interaction flow

Conduct a Conversation with the User

A custom skill typically gets a question or other information from the user and then replies with an answer or some action, such as ordering a car or a pizza. Users can invoke your skill by using your invocation name in combination with sample utterances and phrases defined by Alexa:

  • Alexa, get high tide for Seattle from Tide Pooler
  • Alexa, ask Recipes how do I make an omelet?
  • Alexa, ask Daily Horoscopes about Taurus
  • Alexa, give ten points to Stephen using Score Keeper

Users can also start interacting with a skill without providing any specific question or request:

  • Alexa, open Tide Pooler
  • Alexa, talk to Recipes
  • Alexa, play Quick Trivia
  • Alexa, start Score Keeper

Users may use this option if they don't know or can't remember the exact request they want to make. In this case, the skill normally returns a welcome message that gives the user brief help on how to use the skill.

In the above examples, the wake word "Alexa" and launch words such as "ask," "open," "start," "from," and "using" are defined by the Alexa service; the skill names ("Tide Pooler," "Recipes") are invocation names, and phrases such as "get high tide for Seattle" come from the sample utterances defined for the skill.
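When a user opens the skill without a specific request, Alexa sends the skill a LaunchRequest, and the welcome message described above is typically returned from a handler for that request type. A minimal sketch with the ASK SDK for Python might look like this; the class name and wording are illustrative.

```python
# Illustrative sketch: returning a welcome message for a LaunchRequest
# (for example, after "Alexa, open Tide Pooler").
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.utils import is_request_type
from ask_sdk_model import Response


class LaunchRequestHandler(AbstractRequestHandler):
    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        speech = ("Welcome to Tide Pooler. You can ask for high tide in a city, "
                  "for example: when is high tide in Seattle?")
        # ask() keeps the session open and sets the reprompt used if the user stays silent.
        return handler_input.response_builder.speak(speech).ask(speech).response
```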

If your skill needs more information to complete a request, you can have a back-and-forth conversation with the user, as in the following exchange (a code sketch follows the example):

User: Alexa, get high tide from Tide Pooler (Although 'get high tide' maps to the OneshotTideIntent, the user didn't specify the city. Tide Pooler needs to collect this information to continue.)

Tide Pooler: Tide information for what city? (Alexa is now listening for the user's response. For a device with a light ring, like an Amazon Echo, the device lights up to give a visual cue)
User: Seattle

Tide Pooler: Today in Seattle, the first high tide will be at…
Interaction ends.
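One way to implement this kind of back-and-forth with the ASK SDK for Python is to return a Dialog.ElicitSlot directive when the City slot is empty, so that Alexa asks the follow-up question and fills the user's answer into that slot. The sketch below is an illustrative variant of the earlier OneshotTideIntent handler; the class name and prompts are assumptions.

```python
# Illustrative sketch: eliciting a missing City slot so the conversation can continue.
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.utils import is_intent_name
from ask_sdk_model import Response
from ask_sdk_model.dialog import ElicitSlotDirective


class TideDialogHandler(AbstractRequestHandler):
    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("OneshotTideIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        intent = handler_input.request_envelope.request.intent
        city_slot = intent.slots.get("City") if intent.slots else None

        if not (city_slot and city_slot.value):
            # No city yet: prompt for it and keep the session open so Alexa listens
            # for the answer, which is delivered back as the City slot value.
            return (handler_input.response_builder
                    .speak("Tide information for what city?")
                    .ask("For what city would you like tide information?")
                    .add_directive(ElicitSlotDirective(slot_to_elicit="City",
                                                       updated_intent=intent))
                    .response)

        speech = "Today in {}, the first high tide will be around 1:42 in the morning.".format(
            city_slot.value)
        return handler_input.response_builder.speak(speech).response
```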


Provide a Visual Component for Your Skill

The Amazon Alexa App is a free companion app available for Fire OS, Android, iOS, and desktop web browsers. This app is relevant to your custom skill in two ways:

  • The app displays skill detail cards for all published skills. Users review these cards to learn what your skill does and how to use it when deciding whether to enable your skill. You facilitate this by providing useful information about your skill when preparing it for publishing.
  • The app displays home cards that describe or enhance the user's voice interactions with Alexa. Users can view these cards later, to get more information about the interaction or refresh their memory about Alexa's response.

    Your skill can include content for these cards in your responses.

For example, the Tide Pooler skill sends a home card containing the tide information the user asked for. The home card gives the user a way to view the tide information without making another voice request.
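For instance, a response that includes both the spoken text and a simple home card might be built as in the sketch below, which assumes the ASK SDK for Python; the helper function, card title, and text are illustrative.

```python
# Illustrative sketch: returning speech plus a home card that the user can read
# later in the Alexa app.
from ask_sdk_model.ui import SimpleCard


def build_tide_response(handler_input, city, tide_text):
    # tide_text is whatever spoken answer the skill composed, for example
    # "Today in Seattle, the first high tide will be around 1:42 in the morning."
    return (handler_input.response_builder
            .speak(tide_text)
            .set_card(SimpleCard(title="Tide Pooler: " + city, content=tide_text))
            .response)
```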

Figure: Home card for the Tide Pooler sample

You can also use Alexa Presentation Language (APL) to create a skill that incorporates voice, screen, and touch interactions for Alexa-enabled devices with a screen.
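As a rough sketch of what that looks like in code (assuming the ASK SDK for Python), a skill can return an Alexa.Presentation.APL.RenderDocument directive along with its speech. The inline APL document and data source below are illustrative placeholders, and a real skill should first check that the requesting device supports APL.

```python
# Illustrative sketch: rendering a simple APL document on devices with a screen.
from ask_sdk_model.interfaces.alexa.presentation.apl import RenderDocumentDirective

# A placeholder APL document; real skills typically author documents in the APL
# authoring tool and load them from a JSON file.
TIDE_APL_DOCUMENT = {
    "type": "APL",
    "version": "1.6",
    "mainTemplate": {
        "parameters": ["payload"],
        "items": [{"type": "Text", "text": "${payload.tideData.text}"}],
    },
}


def build_screen_response(handler_input, tide_text):
    # Before adding the directive, a real skill should confirm that the device
    # supports the Alexa.Presentation.APL interface.
    return (handler_input.response_builder
            .speak(tide_text)
            .add_directive(RenderDocumentDirective(
                token="tideScreen",
                document=TIDE_APL_DOCUMENT,
                datasources={"tideData": {"text": tide_text}}))
            .response)
```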


Collect the Images, Audio Files, and Video Files for Use in Your Skill

Your skill may use audio files with the AudioPlayer interface. Skills that are designed for an Alexa-enabled device with a screen may also use images and video files.

Any such external resources must be available on a publicly accessible website. Each item is referenced by a unique URL that uses HTTPS.
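For example, a skill that streams audio might return an AudioPlayer.Play directive pointing at such an HTTPS URL. The sketch below assumes the ASK SDK for Python; the URL and token are placeholders.

```python
# Illustrative sketch: starting playback of an audio file hosted at a publicly
# accessible HTTPS URL. The URL and token are placeholders.
from ask_sdk_model.interfaces.audioplayer import (
    AudioItem, PlayBehavior, PlayDirective, Stream)


def build_play_response(handler_input):
    stream = Stream(
        token="example-track-1",                       # identifies the stream in later requests
        url="https://example.com/audio/episode1.mp3",  # must be HTTPS and publicly accessible
        offset_in_milliseconds=0)
    directive = PlayDirective(play_behavior=PlayBehavior.REPLACE_ALL,
                              audio_item=AudioItem(stream=stream))
    return (handler_input.response_builder
            .add_directive(directive)
            .set_should_end_session(True)
            .response)
```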

Host the Cloud-Based Service for Your Skill

You can host your service as an AWS Lambda function or as a web service on your own endpoint.

AWS Lambda (an Amazon Web Services offering) is a service that lets you run code in the cloud without managing servers. Alexa sends user requests to your Lambda function; your code can inspect the request, take any necessary actions (such as looking up information online), and then send back a response. You can write Lambda functions in Node.js, Java, Python, C#, Go, Ruby, or PowerShell. This is generally the easiest way to host the service for a skill.

Alternatively, you can write a web service and host it with any cloud hosting provider. The web service must accept requests over HTTPS. In this case, Alexa sends requests to your web service and your service takes any necessary actions and then sends back a response. You can write your web service in any language.


Sample Code

Sample code for the ASK SDK demonstrates a basic Hello World skill as well as other features of the SDK.
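As a minimal illustration, a complete Hello World skill written with the ASK SDK for Python (ask-sdk-core) and hosted on AWS Lambda might look like the sketch below. It assumes an interaction model that defines a HelloWorldIntent; the official samples cover this and additional SDK features in more depth.

```python
# Illustrative sketch: a complete Hello World skill using the ASK SDK for Python,
# hosted on AWS Lambda. Assumes an interaction model that defines HelloWorldIntent.
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.utils import is_intent_name, is_request_type
from ask_sdk_model import Response


class LaunchRequestHandler(AbstractRequestHandler):
    """Greets the user when the skill is opened without a specific request."""

    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        speech = "Welcome. You can say hello to me."
        return handler_input.response_builder.speak(speech).ask(speech).response


class HelloWorldIntentHandler(AbstractRequestHandler):
    """Responds to the HelloWorldIntent defined in the interaction model."""

    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("HelloWorldIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        return handler_input.response_builder.speak("Hello, world!").response


sb = SkillBuilder()
sb.add_request_handler(LaunchRequestHandler())
sb.add_request_handler(HelloWorldIntentHandler())

# Configure this as the Lambda entry point (for example, "lambda_function.handler").
handler = sb.lambda_handler()
```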


Last updated: Nov 28, 2023