

Multimodal Alexa Skills

How Multimodal Skills Work

Adding rich audio, visuals, and touch can make your voice experience more engaging and easier to use. The Alexa Presentation Language (APL) is a design language for creating skills with rich audio and visuals and for adapting them to different device types, such as Echo Show devices, Fire TVs, LG TVs, and Lenovo Smart Tab devices.


Key Features of Multimodal Skills

Custom Skills

Multimodal skills are a category of custom skills. Custom skills can support nearly any use case.

APL Authoring Tool

Use the APL authoring tool to build your visual content and preview how it displays on a device screen.

Data Binding & Data Sources

Use data binding to retrieve data from a separate data source that you provide.
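As a rough illustration of data binding, the sketch below builds a RenderDocument directive as a plain Python dictionary. The document reads its text from the data source through `${payload...}` expressions instead of hard-coding it. The data source key `weatherData`, its fields, the component id `headline`, and the helper name `build_render_directive` are all hypothetical names chosen for this example.

```python
import json

# Hypothetical data source. The key "weatherData" and its fields are
# illustrative names, not taken from any real skill.
DATASOURCES = {
    "weatherData": {
        "city": "Seattle",
        "summary": "Light rain",
    }
}

# Minimal APL document. The "payload" parameter receives the data sources,
# so expressions read values as ${payload.<key>.<field>}.
DOCUMENT = {
    "type": "APL",
    "version": "1.4",
    "mainTemplate": {
        "parameters": ["payload"],
        "items": [
            {
                "type": "Text",
                "id": "headline",
                "text": "${payload.weatherData.city}: ${payload.weatherData.summary}",
            }
        ],
    },
}


def build_render_directive(token: str) -> dict:
    """Pair the document with its data source in one RenderDocument directive."""
    return {
        "type": "Alexa.Presentation.APL.RenderDocument",
        "token": token,
        "document": DOCUMENT,
        "datasources": DATASOURCES,
    }


if __name__ == "__main__":
    print(json.dumps(build_render_directive("weatherToken"), indent=2))
```

Keeping the data separate from the layout means the same document can be reused with different data on each response.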


APL Commands

Use APL commands to change the visual experience at runtime or to communicate with your skill’s Lambda function or web service during the interaction.
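The two uses of commands mentioned here can be sketched with plain Python dictionaries. The first helper builds an ExecuteCommands directive that uses a `SetValue` command to change text that is already on screen. The second fragment is a touchable component whose `SendEvent` command sends a message back to the skill. The component id `headline`, the argument `"refresh"`, and the names `build_execute_commands` and `REFRESH_BUTTON` are assumptions made for this example.

```python
def build_execute_commands(token: str, new_text: str) -> dict:
    """Build a directive that updates on-screen text at runtime.

    The token is expected to match the token of the document that is
    currently rendered. "headline" is a hypothetical component id.
    """
    return {
        "type": "Alexa.Presentation.APL.ExecuteCommands",
        "token": token,
        "commands": [
            {
                "type": "SetValue",
                "componentId": "headline",
                "property": "text",
                "value": new_text,
            }
        ],
    }


# Document fragment: tapping this component runs SendEvent, which delivers
# the listed arguments to the skill's Lambda function or web service.
REFRESH_BUTTON = {
    "type": "TouchWrapper",
    "onPress": {"type": "SendEvent", "arguments": ["refresh"]},
    "item": {"type": "Text", "text": "Refresh"},
}


if __name__ == "__main__":
    print(build_execute_commands("weatherToken", "Seattle: Sunny"))
```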

Directives & Requests

Exchange APL-related information with your Lambda function or web service through APL directives and requests.
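On the request side, a skill typically checks whether the device supports APL before sending APL directives, and it reads the arguments of any UserEvent request that the document sends. The sketch below works on the request envelope as a plain dictionary rather than through an SDK. The helper names and the sample envelope are hypothetical, and real requests carry many more fields.

```python
from typing import Optional


def supports_apl(envelope: dict) -> bool:
    """Return True if the requesting device lists the APL interface."""
    interfaces = (
        envelope.get("context", {})
        .get("System", {})
        .get("device", {})
        .get("supportedInterfaces", {})
    )
    return "Alexa.Presentation.APL" in interfaces


def user_event_arguments(envelope: dict) -> Optional[list]:
    """Return the arguments of an APL UserEvent request, or None otherwise."""
    request = envelope.get("request", {})
    if request.get("type") != "Alexa.Presentation.APL.UserEvent":
        return None
    return request.get("arguments", [])


# Trimmed, hypothetical envelope for illustration only.
SAMPLE_ENVELOPE = {
    "context": {
        "System": {
            "device": {"supportedInterfaces": {"Alexa.Presentation.APL": {}}}
        }
    },
    "request": {
        "type": "Alexa.Presentation.APL.UserEvent",
        "arguments": ["refresh"],
    },
}

if __name__ == "__main__":
    print(supports_apl(SAMPLE_ENVELOPE))
    print(user_event_arguments(SAMPLE_ENVELOPE))
```

A voice-only device would fail the first check, in which case the skill can fall back to a speech-only response.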

Alexa Design System for APL

Use pre-built responsive components, templates, styles, and resources to build your visual content.


Case Studies about Multimodal Skills

Big Sky Skill Enhanced with APL

Big Sky is a weather skill that delivers personalized, hyper-local weather information. Steven Arkonovich uses Alexa Presentation Language to present a simple yet striking set of visuals to complement the voice experience.

bondad.fm Is “All in for Voice”

With zero coding experience, John Gillilan never expected to find life-changing success building Alexa skills. But today, Gillilan heads up bondad.fm, a thriving voice design consultancy that creates highly engaging multimodal skills for Alexa.

Ben Ursu Spins the Fork On The Road

For years, professional developer Ben Ursu has built technology that brings immersive visual and augmented reality experiences to life. But Ursu was curious how he might integrate such powerful visual interfaces with another engaging technology: voice. 


Example Multimodal Skills

Official Harry Potter Quiz
Ticket to Ride
Food Network Kitchen

Get Started with Multimodal Skills

Ready to build? Work through our tutorial on multimodal skills, then create your own skill with the Alexa Skills Kit.

Content Related to Multimodal Skills