This new Alexa A/B Testing service helps skill developers increase customer engagement and revenue

Arun Krishnan Jan 28, 2022

New Alexa Skills A/B testing service allows skill builders to make informed design choices.

In the third quarter of 2021, the development team at Vocala had to make a critical decision: they had to determine the ideal length of a prompt that would motivate users to subscribe to the paid version of their skill.

Vocala is a leading voice design studio based in the United Kingdom. Over the last few years, Vocala has developed Interflora, the United Kingdom’s first retail sector voice-based skill, as well as a recruiting skill for the British Royal Navy. Vocala is also the publisher of the Yak Yak Games skill, which has a 4.5 rating on the Amazon UK store. Yak Yak Games is a curated collection of leading games such as Deal or No Deal, Song Blast, and Jeff Stelling's Sports Quiz. The skill has a seven-day free trial, at the end of which customers can sign up for the paid version through a monthly subscription.

“We wanted to test the prompt that would result in the greatest number of paid conversions,” says James Holland, lead voice developer at Vocala. “More specifically, we wanted to run an A/B test to determine whether a longer or shorter prompt would help us achieve our goal.”

A/B testing is a commonly used experimentation process for making design decisions by comparing two versions of an experience. For web or mobile design, the variables tested could include the placement of a form on the page, or the color of a Register button. For skill builders, A/B testing can be used to arrive at the optimal utterances for objectives such as reduced customer friction, increased daily repeat visits, and greater revenue.

Historically, designing and deploying A/B tests for Alexa skills has been a cumbersome process. Setting up infrastructure for an A/B test could take as long as two weeks.

“Working on multiple projects at one time, and responding to clients who want quick turnarounds, we don’t always have the bandwidth for lengthy A/B experiments,” said Holland.

As part of their efforts to continually improve Alexa’s offerings for developers, Alexa solution architects meet regularly with skill builders like Vocala. In response to recent developer feedback on the importance of experimentation, Amazon is today launching the Alexa Skills A/B testing service. The new service allows skill builders to design A/B experiments with the goal of maximizing in-skill purchases, repeat visits, and number of dialogs for a session.

“With the new A/B skills testing service, we were able to design and deploy an A/B test in a little under two hours,” says Holland. “We could analyze the results of our experiment through a dashboard. After a few weeks, we were clearly able to see that the longer prompt was over 15 percent more effective in driving paid conversions.”

The A/B testing service automates several facets of experimentation: randomizing customers, routing them to the control and treatment versions of a skill, and displaying experiment-related analytics on a dashboard.
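To make the idea of customer randomization concrete, here is a minimal sketch of one common way to split customers between control and treatment: hashing a customer identifier into a stable bucket. This is an illustration only, not a description of how the Alexa service does it; the function name `assign_variant`, its parameters, and the identifiers used are all hypothetical.

```python
import hashlib


def assign_variant(customer_id: str, experiment_id: str,
                   treatment_percent: float = 50.0) -> str:
    """Deterministically map a customer to 'control' or 'treatment'.

    Hypothetical helper for illustration; not part of any Alexa API.
    """
    # Include the experiment ID so the same customer can land in
    # different groups across different experiments.
    key = f"{experiment_id}:{customer_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()

    # Map the first 8 bytes of the hash to a bucket in [0, 10000),
    # i.e. hundredths of a percent.
    bucket = int.from_bytes(digest[:8], "big") % 10_000

    # Customers whose bucket falls below the threshold see the treatment.
    return "treatment" if bucket < treatment_percent * 100 else "control"


if __name__ == "__main__":
    for cid in ("customer-001", "customer-002", "customer-003"):
        print(cid, assign_variant(cid, "prompt-length-test"))
```

Because the assignment depends only on the two identifiers, a returning customer keeps seeing the same version for the life of the experiment, which is what makes retention-style metrics comparable between groups.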

“Ultimately, the A/B testing service allows us to get a holistic understanding of customer behavior,” says Daniel Mittendorf, CTO at Beyto, a voice development agency. Mittendorf developed his first Alexa skill, “Stream Player,” in 2017. He built the skill for his personal use: Stream Player allowed him to watch soccer in the kitchen while he was doing the dishes. Since its introduction, the skill has proven to be wildly successful with customers. Today, it is available in nine locales and six languages.

“We have developed two versions of Stream Player,” says Mittendorf. “The newer version is built using the Alexa Presentation Language (APL). It allows people to switch channels within a session, as opposed to an older experience, where the session is terminated after eight seconds. However, the newer version of the skill suffers from a crucial drawback. Because utterances are sent to the skill and not the device, customers cannot turn the volume up and down as easily as they could with the older version.”

Mittendorf wanted to test whether the original or APL-based experience drove higher customer retention.

“We were able to launch an A/B experiment in less than an hour,” he says. “Within three weeks we could see that the APL version of the skill drove seven percent greater engagement and also increased the retention rate. The A/B Skills Testing feature allowed us to arrive at an important decision in terms of how we should be thinking about investing our limited development resources. Going forward, we will invest our time on the APL version of the skill.”




What to Test: Engagement, Revenue, and Lots More

The Alexa Skills A/B testing service allows developers to configure experiments on the live versions of their skills easily. Skill builders can design experiments focused on metrics related to customer engagement, retention, drop-off, and monetization including:

  • Customer perceived friction rate: A metric calculated from various customer interaction patterns and other contextual signals to predict whether a customer perceived friction.
  • In-skill Purchasing (ISP) offers: The number of times (quantity) an ISP offer is presented to a customer.
  • ISP accepts: The number of times (quantity) that a customer has accepted an offer to purchase an ISP product.
  • ISP sales: The amount of revenue generated from ISP sales.
  • ISP offer accept rate: The total number of offers accepted divided by the total number of offers delivered. Note that accepted offers are counted prior to the payment being complete and successful.
  • ISP offer purchase success rate: The total number of ISP purchases completed divided by the total number of offers delivered.
  • Skill next day retention: How often a customer uses your skill in a day or a set of consecutive days.
  • Skill dialogs: Tracks customer sessions with skills. This metric maps to one dialog per session with a skill, regardless of the number of interactions (multi turns) within a session.
  • Skill active days: The active number of days a customer is using your skill. This number is calculated from your skill's Dialog data. An active day is counted if a customer has at least one dialog on that day.
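The two rate metrics and the active-days metric in the list above reduce to simple arithmetic. The sketch below computes them from per-variant counts and compares treatment against control. The counts are made-up numbers for illustration, and the names `VariantCounts`, `offer_accept_rate`, `offer_purchase_success_rate`, `active_days`, and `relative_lift` are hypothetical; the service's dashboard reports these figures for you.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass
class VariantCounts:
    """Raw ISP counts for one experiment variant (illustrative only)."""
    offers_delivered: int
    offers_accepted: int
    purchases_completed: int


def offer_accept_rate(c: VariantCounts) -> float:
    # Offers accepted divided by offers delivered.
    return c.offers_accepted / c.offers_delivered if c.offers_delivered else 0.0


def offer_purchase_success_rate(c: VariantCounts) -> float:
    # Purchases completed divided by offers delivered.
    return c.purchases_completed / c.offers_delivered if c.offers_delivered else 0.0


def active_days(dialog_dates: Iterable[date]) -> int:
    # A day counts as active if the customer had at least one dialog on it.
    return len(set(dialog_dates))


def relative_lift(control: float, treatment: float) -> float:
    # Fractional change of treatment over control, e.g. 0.10 means +10%.
    if control == 0:
        raise ValueError("control value must be non-zero to compute lift")
    return (treatment - control) / control


if __name__ == "__main__":
    # Made-up counts, not results from any real experiment.
    control = VariantCounts(offers_delivered=1000, offers_accepted=100,
                            purchases_completed=80)
    treatment = VariantCounts(offers_delivered=1000, offers_accepted=115,
                              purchases_completed=88)

    lift = relative_lift(offer_purchase_success_rate(control),
                         offer_purchase_success_rate(treatment))
    print(f"Accept rate (control):   {offer_accept_rate(control):.1%}")
    print(f"Accept rate (treatment): {offer_accept_rate(treatment):.1%}")
    print(f"Purchase success lift:   {lift:.1%}")
```

A statement such as "the treatment was 10 percent more effective" usually refers to a relative lift of this kind rather than a difference in percentage points, so it is worth checking which one a dashboard or report is quoting.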

How to Get Started

To configure your experiment, log in to the Alexa Developer Console and go to the landing page for your skills. Find the skill you want to run an experiment on, and click the Certification tab. Navigate to the A/B Testing section and click the “Create” link. Configure the test, then click Run to start your experiment.

View experiment-related data for the live skill version (control) and the certified skill version (treatment) in the Experiment Analytics section. 

To complete your experiment, navigate to the “Certification/AB Testing” tab, and click on Complete an Experiment for the relevant experiment. You can then select whether to dial the treatment options up or down, and provide additional configuration data.

Going forward, Holland plans to utilize the A/B testing service to inform a large variety of design decisions.

“We plan to deploy an experiment for our Royal Navy skill next,” says Holland. “More specifically, we want to understand the intents we should present to the end customer. For example, should we present the qualification intent to users or not? We didn't have a way to make informed design decisions before. Now, with the A/B Skills Testing feature, we have an easy way to develop a comprehensive understanding of customer behavior.”

Get started by reviewing the technical documentation, then visit the Alexa Developer Console to enhance your skills.
