Understand the Alexa Web API for Games

You can use the Alexa Web API for Games to build web-based games that users can play on Alexa devices. This API lets you build in voice commands, so users can interact with your app with both touch and voice.

Build with the Alexa Web API for Games

To create a game with the Alexa Web API for Games, you build two parts:

  • An Alexa custom skill that starts up the web app and handles voice requests. Your skill code uses the Alexa.Presentation.HTML interface to start the web app and then communicate with the app while it runs.
  • A web app that can display in a browser and interact with users. Your web app code uses the Alexa JavaScript API to communicate with the skill while the app runs.

As you plan and build your skill, note that you can build these pieces independently and integrate them later. For example, you can create a fully functional web app for your game, then add the calls for Alexa speech and voice control afterward.

How users interact with the skill and web app

Users invoke and interact with your skill normally. Your skill starts the web app part of the experience at an appropriate point in the interaction, such as in the response to an IntentRequest. For example, you might design your skill to ask the user if they are ready to start:

User: Alexa, open My Web Game
Alexa: Welcome to My Web Game. Are you ready to begin?
User: Yes
Alexa: OK, starting the engines… (sound effects)
Skill launches the web app and displays the starting page.

After the web app starts, the user can interact with the app using both touch (or remote) and voice. For example, the user can:

  • Press buttons on the screen. These actions might trigger Alexa speech or other actions within your web app.
  • Speak commands to the game (for example: "Alexa, fire the missiles!").
  • Respond to voice prompts triggered by your app (for example: "That was a miss! Do you want to try again?").

This interaction continues until the web app closes, for example when the user says "Alexa, exit".

Communication between your skill and your web app

While the user interacts with your web app, you use the Alexa.Presentation.HTML interface and Alexa JavaScript API for communication between the skill and the web app.

  • The Alexa.Presentation.HTML interface provides your skill with directives and requests for communicating with the web app. Your skill returns directives to send information to the web app, and uses request handlers that can listen for events coming from the web app.
  • The Alexa JavaScript API provides your web app with an Alexa class with methods and properties for communicating with the skill. Your web app calls methods on this class to send messages to the skill, and registers callbacks to listen for events coming from the skill.

The flow looks like this:

  1. The user invokes your skill and triggers the request to start the web app. Your skill gets a normal LaunchRequest or IntentRequest.
  2. The request handler in your skill code returns a response with the Alexa.Presentation.HTML.Start directive. This tells Alexa to start the web app.
  3. The device launches your web app.
  4. Your web app loads the Alexa JavaScript API and calls Alexa.create() to create an instance of the Client class.
  5. The user begins interacting with the web app normally.
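
Steps 1 and 2 of this flow can be sketched on the skill side as plain response-building code. This is a minimal sketch rather than the definitive implementation: the URL, the intent handler name, and the initial data payload are hypothetical, and the directive field names (data, request.uri, request.method) are written from memory of the Alexa.Presentation.HTML interface and should be checked against the interface reference.

```javascript
// Hypothetical URL: replace with the HTTPS endpoint that hosts your web app.
const WEB_APP_URL = "https://example.com/my-web-game/index.html";

// Builds the directive that asks Alexa to start the web app.
// Field names are assumptions to verify against the interface reference.
function buildStartDirective(initialData) {
  return {
    type: "Alexa.Presentation.HTML.Start",
    data: initialData, // startup payload passed to the web app
    request: { uri: WEB_APP_URL, method: "GET" },
  };
}

// Hypothetical handler for the IntentRequest that follows the user's "Yes".
// shouldEndSession is deliberately left unset so the session stays open.
function handleStartGameIntent() {
  return {
    version: "1.0",
    response: {
      outputSpeech: { type: "PlainText", text: "OK, starting the engines." },
      directives: [buildStartDirective({ level: 1 })],
    },
  };
}
```

In a skill built with an SDK, the same directive object would typically be added through the SDK's response builder instead of being assembled by hand.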

Once the game is running, communication flows between the skill and the app:

  • The web app calls alexa.skill.sendMessage() to send messages to the skill, such as an event indicating that Alexa should prompt the user for voice input. Each message can include data you define to represent what's happening in the web app. You can also provide an optional callback to get a status code indicating whether the sendMessage() call was successful.
  • These messages from the app are sent to the skill in the Alexa.Presentation.HTML.Message request. The skill handles these incoming requests and responds with directives and output speech.
  • The skill returns the Alexa.Presentation.HTML.HandleMessage directive to send messages to the app.
  • The web app calls alexa.skill.onMessage() to register a callback that responds to incoming messages sent from the skill. The app can register only one callback, so the callback function should include logic to handle different messages based on the data provided within the message.
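
Because the web app can register only one message callback, one common approach is a small router that dispatches on a field inside the message. The sketch below assumes a "type" field on each message, which is a convention invented for this example and not part of the API. The helper names createMessageRouter and connectToSkill are also hypothetical. The client object is passed in as a parameter so the code can run outside a device.

```javascript
// Builds a single callback that routes each message to a handler chosen
// by its "type" field (a convention assumed for this sketch).
function createMessageRouter(handlers, onUnknown = () => {}) {
  return function route(message) {
    const type = message ? message.type : undefined;
    const known = Object.prototype.hasOwnProperty.call(handlers, type);
    return known ? handlers[type](message) : onUnknown(message);
  };
}

// Registers the router with a client shaped like the one Alexa.create()
// provides, and returns a helper for sending messages to the skill.
function connectToSkill(alexa, handlers) {
  alexa.skill.onMessage(createMessageRouter(handlers));
  return function send(message, onResult) {
    // onResult is the optional status callback described above.
    alexa.skill.sendMessage(message, onResult);
  };
}

// In the web app, the client would come from the Alexa JavaScript API.
// The version string below is an assumption:
//
// Alexa.create({ version: "1.1" }).then(({ alexa }) => {
//   const send = connectToSkill(alexa, {
//     hit: (message) => showExplosion(message.target), // hypothetical UI call
//   });
//   send({ type: "fire" });
// });
```

Keeping the routing table separate from the registration call makes it straightforward to add message types later without touching the single registered callback.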

During this interaction, the skill session remains open. Also, the Alexa JavaScript API lets your web app register additional handlers to listen for other events, such as when Alexa begins and ends speech. You can use these to build the flow between on-screen events such as button presses and voice commands.

For examples of how you use communication between the skill and web app to build in voice control and Alexa speech responses, see Add Voice Control and Speech to the Web App.
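
On the skill side, the same exchange might look like the following sketch. It assumes that the payload from the web app arrives in the message property of the Alexa.Presentation.HTML.Message request and that the HandleMessage directive carries its payload in a message property. Both should be confirmed against the interface reference. The message types "missed", "promptShown", and "ack" are hypothetical.

```javascript
// Wraps a payload in the directive that delivers it to the web app.
function buildHandleMessageDirective(payload) {
  return { type: "Alexa.Presentation.HTML.HandleMessage", message: payload };
}

// Handles a message sent by the web app. Returns null for any other request
// type so that the caller can fall through to its other handlers.
function handleWebAppMessage(requestEnvelope) {
  const request = requestEnvelope.request;
  if (request.type !== "Alexa.Presentation.HTML.Message") {
    return null;
  }
  const incoming = request.message || {};

  if (incoming.type === "missed") {
    // Speak a prompt and open the microphone for the user's answer.
    return {
      version: "1.0",
      response: {
        outputSpeech: {
          type: "PlainText",
          text: "That was a miss! Do you want to try again?",
        },
        directives: [buildHandleMessageDirective({ type: "promptShown" })],
        shouldEndSession: false,
      },
    };
  }

  // Default: acknowledge silently and leave the session open.
  return {
    version: "1.0",
    response: {
      directives: [
        buildHandleMessageDirective({ type: "ack", received: incoming.type }),
      ],
    },
  };
}
```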

The web app and skill session

The web app changes the normal skill session lifecycle. See the following sections for details:

Start the web app and keep the session open

When your skill sends the directive to start the web app, the response must include shouldEndSession set to either false or undefined (not set). This keeps the skill session open so the user can interact with the web app on the screen.

  • When shouldEndSession is false, Alexa speaks the provided outputSpeech and then opens the microphone for a few seconds to get the user's response.
  • When shouldEndSession is undefined, Alexa speaks any provided outputSpeech, but does not open the microphone. The session remains open.
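
The difference between the two settings can be captured in a small helper. This is an illustrative sketch: the function name and its parameters are hypothetical, and the start directive is passed in as an opaque object.

```javascript
// Builds the response that starts the web app.
// openMicrophone = true  -> shouldEndSession is false (Alexa listens after speaking).
// openMicrophone = false -> shouldEndSession is omitted (no microphone, session stays open).
function buildStartResponse(startDirective, speechText, openMicrophone) {
  const response = { directives: [startDirective] };
  if (speechText) {
    response.outputSpeech = { type: "PlainText", text: speechText };
  }
  if (openMicrophone) {
    response.shouldEndSession = false;
  }
  // Never set shouldEndSession to true here: that would end the session
  // and close the web app.
  return { version: "1.0", response };
}
```

Omitting the property entirely, rather than assigning undefined, keeps it out of the serialized JSON response.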

The session remains open as long as the web app is active on the screen. This is different from a normal skill with screen content, such as a skill built with Alexa Presentation Language or the display templates. For those skills, the session can remain open for approximately 30 seconds.

What closes the web app?

Once the device displays your web app, your app remains active on the screen. The following actions close the app:

  • The skill session ends.
  • The skill returns a directive from an interface other than Alexa.Presentation.HTML. This closes the web app, but doesn't necessarily close the skill session.

For example, suppose the skill returns the Alexa.Presentation.APL.RenderDocument directive. In this case, the device closes the web app and inflates the provided document. The skill session then follows the lifecycle described in How devices with screens affect the skill session until the skill sends another Alexa.Presentation.HTML.Start directive to restart the web app.

What ends the skill session?

Any of the following actions can end the session:

  • Your skill returns a response with shouldEndSession set to true.
  • The user ends the skill with "Alexa, exit".
  • The user exits the skill with "Alexa, go home".
  • The user stops interacting with the web app and leaves it idle. When the configurable timeout (up to 30 minutes) elapses, the skill session ends. Specific devices might choose to ignore the configured timeout value, or set a lower bound.

Ending the skill session also closes the web app.
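
As a sketch of how the idle timeout might be set, the helper below clamps a requested value to the 30-minute maximum before it goes into the start directive. It assumes that the timeout is passed as configuration.timeoutInSeconds on the Alexa.Presentation.HTML.Start directive, which should be verified against the interface reference. The helper name and the one-second minimum are choices made for this example.

```javascript
const MAX_TIMEOUT_SECONDS = 30 * 60; // the 30-minute ceiling described above

// Returns a configuration object with the timeout clamped to [1, 1800] seconds.
// The lower limit of 1 second is an assumption made for this sketch.
function buildTimeoutConfiguration(requestedSeconds) {
  if (!Number.isFinite(requestedSeconds)) {
    throw new TypeError("requestedSeconds must be a finite number");
  }
  const whole = Math.floor(requestedSeconds);
  const clamped = Math.min(Math.max(whole, 1), MAX_TIMEOUT_SECONDS);
  return { timeoutInSeconds: clamped };
}

// Usage: merge into a start directive (other fields omitted here).
const exampleDirective = {
  type: "Alexa.Presentation.HTML.Start",
  configuration: buildTimeoutConfiguration(10 * 60),
};
```

Because some devices might ignore or cap the configured value, the web app should not rely on the exact duration.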

Can I use both the Alexa Web API for Games and Alexa Presentation Language in the same skill?

You can use both the Alexa Web API for Games and Alexa Presentation Language in your game. However, you cannot mix the two on a single screen. For a given response, you can display either the web app or an APL document.

When your web app is already displayed on the screen, sending the Alexa.Presentation.APL.RenderDocument or ExecuteCommands directive closes the app. Be sure to save any state information as needed for your game.

To use APL in your skill, be sure to configure your skill to support the APL interface (ALEXA_PRESENTATION_APL in the skill manifest).

Requirements for the skill and web app

To use the Alexa Web API for Games, your skill and your web app must meet the following requirements.

  • Configure your skill to support the Alexa Presentation HTML Interface.
  • Host your web app at an internet-accessible HTTPS endpoint. The web server must present a valid and trusted SSL certificate.
  • The web app must be a game. Other types of apps can't use the Alexa Web API for Games.
  • Use the Alexa JavaScript API in your web app to communicate with your skill.
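
Even with the interface enabled for the skill, a request can come from a device that cannot display a web app. A defensive check, sketched here, looks for the Alexa.Presentation.HTML key in the supported interfaces that the request reports. The function name is hypothetical, and the path context.System.device.supportedInterfaces reflects the standard request envelope as understood here.

```javascript
// Returns true when the requesting device reports support for the
// Alexa.Presentation.HTML interface.
function supportsWebApp(requestEnvelope) {
  const interfaces =
    requestEnvelope?.context?.System?.device?.supportedInterfaces ?? {};
  return Object.prototype.hasOwnProperty.call(
    interfaces,
    "Alexa.Presentation.HTML"
  );
}
```

A skill could use this check to decide between sending the start directive and falling back to a voice-only response.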

Test your skill and the web app

You can use Chrome DevTools on your development machine to debug your skill's web app that's running on an Amazon Fire TV device. For more information, see Using Web App Tester and DevTools.

Use of HTML game engines and relevant libraries

With the Alexa Web API for Games, you can use various techniques and tools to manage graphics and advanced audio. You can create animation with HTML and CSS, Canvas, SVG, and WebGL. You also have numerous HTML game frameworks and libraries to choose from. Here are a few popular options:

  • Phaser – Game Framework for Canvas and WebGL
  • Pixi.js – Game Framework and 2D graphics using WebGL
  • three.js – 3D library and app framework using WebGL
  • Howler.js – Audio Library