Understand the Music, Radio, and Podcast Skill API

The Alexa Music, Radio, and Podcast Skill API is a set of interfaces that enable selection and control of audio content streamed through an Alexa-enabled device. When you use this API to build a skill, the voice interaction model is defined and handled for you. Alexa interprets user utterances and sends messages to your skill that communicate these requests.

Introduction

The Alexa Music, Radio, and Podcast Skill API enables you to:

  • Integrate your service with Alexa so that users can play music, radio, or podcasts from your catalog on Alexa-enabled devices.
  • Integrate your service with Alexa features such as music alarms and multi-room music.
  • Ingest your music, radio, or podcast catalog for voice modeling purposes.
  • Provide metadata for audio playing from your service.
  • Subscribe to reporting capabilities.

With the Alexa Music, Radio, and Podcast Skill API, you can rely on Alexa to innovate on the core audio and voice interaction experience while you focus on onboarding and optimizing your music service for Alexa.

Who can build music, radio, and podcast skills?

Anyone can build a music, radio, or podcast skill for their own private use. To create a skill for general public use, you must submit it for certification. Additionally, to create a radio or podcast skill for general public use, you must first register for the corresponding developer preview through your Alexa Music, Radio, or Podcast representative.

Prerequisites

To create a music, radio, or podcast skill, you need the following:

  • An Amazon developer account. Sign-up is free.
  • An Amazon Alexa-enabled device, such as Amazon Echo, registered to your Amazon developer account.
  • A streaming music, radio, or podcast service with a cloud API to control it.
  • The ability to provide your music, radio, or podcast catalog metadata to Amazon on a regular basis (for example, weekly) for voice modeling and entity resolution purposes.
  • Permission to stream the content that your skill or service makes available to users.
  • An AWS account. You host your skill code as an AWS Lambda function.
  • Knowledge of one of the programming languages supported by AWS Lambda: Node.js, Java, Python, C#, or Go.
  • A basic understanding of OAuth 2.0, if your skill uses account linking.

Overview of steps to create a music, radio, or podcast skill

To create a music, radio, or podcast skill, complete the following steps. For more information about each step, see Steps to Create a Music, Radio, or Podcast Skill.

  1. Create a new music skill in the console. (Skip this step if you create your skill by using the CLI.)
  2. Create a new AWS Lambda function for your skill code.
  3. In the Lambda function, implement the functionality that your skill requires.
  4. Create a new music skill by using the CLI. (Skip this step if you created your skill in the console.)
  5. Configure account linking (optional, music and podcast skills only).
  6. Create and upload catalogs for your content.
  7. Enable the skill.
  8. Test the skill.
  9. Submit the skill for certification.

How a music, radio, or podcast skill works

An Alexa music, radio, or podcast skill system consists of the following:

  • User: The person who listens to a music service, radio stations, or podcasts and interacts with an Alexa-enabled device.
  • The Music, Radio, and Podcast Skill API: A service that understands a user's voice commands and converts them to messages that are sent to a music, radio, or podcast skill.
  • AWS Lambda: A compute service offered by Amazon Web Services (AWS) that hosts the music, radio, or podcast skill code.
  • Music, Radio, or Podcast Skill: Code and configuration that interprets messages received from Alexa and communicates with a music, radio, or podcast service cloud.
  • Music, Radio, or Podcast Service Cloud: Your cloud environment that manages your users and content.
  • Music, Radio, or Podcast Content: Audio content that's sent to Alexa for playback on an Alexa-enabled device.
  • Music, Radio, or Podcast Catalogs: Files that you provide to Alexa that contain information about all of the music, radio, or podcast content available through your skill.

The following example scenario explains how an Alexa music skill system works:

  1. A user enables a music skill and then says, "Alexa, play Lady Gaga on <skill name>" to their Alexa-enabled device.
  2. The Alexa-enabled device hears this utterance and sends it to the Alexa service for interpretation.
  3. The Alexa service interprets the action as "play." It composes a JSON message (a GetPlayableContent request) and sends it to the skill to determine if there is music or audio available to satisfy the user's utterance. The GetPlayableContent request includes:
    • The action ("resolve to playable content").
    • A list of resolved entities (artist, album, track, station, etc.) that were found in the music partner's catalog for that utterance.
    • An OAuth 2.0 token authenticating the user (only for skills that have enabled account linking).
  4. The skill receives and parses the request for the action, the resolved entities, and authentication details. It uses this information to communicate with the music service cloud.
  5. The skill communicates with the music service cloud to determine what audio to return to satisfy the user's utterance. The music service cloud returns a content identifier representing the audio. In this example, the identifier might represent a playlist of popular songs by Lady Gaga.
  6. The skill sends a GetPlayableContent response back to the Music, Radio, and Podcast Skill API indicating that the user's utterance can be satisfied, and includes the identifier for the audio.
  7. The Alexa service sends an Initiate API request to the skill, indicating that playback of the audio content should start. The skill returns an Initiate response containing the first playable track to the Alexa service.
  8. The Alexa service translates this into a response on the user's device. For example, Alexa might say, "Playing popular songs by Lady Gaga." Alexa then queues the first track on the device's media player software for immediate playback.
  9. When the first track is almost done playing on the device, the Alexa service requests the next track from the skill using a GetNextItem request. The skill returns another playable track to the Alexa service, which is sent to the user's device for playback. This process repeats until the skill, in response to a request for the next track, indicates there are no more tracks to play.
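
The request and response cycle in the scenario above can be sketched as a small AWS Lambda handler in Python. This is a minimal illustration, not the reference implementation. The message field names (header, payload, selectionCriteria, contentId, currentItemReference, isQueueFinished), the ".Response" naming, and the in-memory SERVICE_CLOUD and ENTITY_TO_CONTENT tables are simplifying assumptions standing in for the real message schemas and your music service cloud. Check the API reference for the exact payload shapes, including how to report errors.

```python
import uuid

# Hypothetical stand-in for the music service cloud: content ID -> ordered track IDs.
SERVICE_CLOUD = {"playlist.popular.lady-gaga": ["track-1", "track-2"]}
# Hypothetical mapping from catalog entity IDs to content IDs.
ENTITY_TO_CONTENT = {"artist.lady-gaga": "playlist.popular.lady-gaga"}


def make_response(request, payload):
    """Build a response that echoes the request namespace (assumed message shape)."""
    header = request["header"]
    return {
        "header": {
            "messageId": str(uuid.uuid4()),
            "namespace": header["namespace"],
            "name": header["name"] + ".Response",
            "payloadVersion": header.get("payloadVersion", "1.0"),
        },
        "payload": payload,
    }


def get_playable_content(request):
    """Steps 4-6: map resolved entities to a content identifier."""
    entities = request["payload"]["selectionCriteria"]["attributes"]
    for entity in entities:
        content_id = ENTITY_TO_CONTENT.get(entity["entityId"])
        if content_id:
            return make_response(request, {"content": {"id": content_id}})
    # Placeholder only: a real skill would return the API's error response here.
    return make_response(request, {"content": None})


def initiate(request):
    """Step 7: return the first playable track for the chosen content."""
    content_id = request["payload"]["contentId"]
    tracks = SERVICE_CLOUD[content_id]
    return make_response(
        request, {"contentId": content_id, "firstItem": {"id": tracks[0]}}
    )


def get_next_item(request):
    """Step 9: return the track after the current one, or signal the end."""
    ref = request["payload"]["currentItemReference"]
    tracks = SERVICE_CLOUD[ref["contentId"]]
    index = tracks.index(ref["id"]) + 1
    if index >= len(tracks):
        return make_response(request, {"isQueueFinished": True})
    return make_response(
        request, {"isQueueFinished": False, "item": {"id": tracks[index]}}
    )


HANDLERS = {
    "GetPlayableContent": get_playable_content,
    "Initiate": initiate,
    "GetNextItem": get_next_item,
}


def lambda_handler(event, context):
    """AWS Lambda entry point: route each message by its header name."""
    name = event["header"]["name"]
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError("Unsupported request: " + name)
    return handler(event)
```

In this sketch, Alexa would call lambda_handler once for GetPlayableContent, once for Initiate, and then once per GetNextItem request until the response reports that the queue is finished. Routing on the header name keeps each request type in its own function, so more request types can be added without touching the entry point.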