About A/B Testing
A/B testing lets you measure and compare real-time feedback from your users by simultaneously deploying two versions of the same skill, so you can experiment with specific variables and determine which version performs better. This process helps you make data-driven decisions about launches and new feature releases.
For example, you can use preconfigured test metrics to identify whether a new update causes issues with a skill, or you can test new features to try to increase customer engagement.
What can I do with A/B tests?
You decide what types of tests you want to run on your skill, depending on the hypothesis you're evaluating. However, there are some limitations to the types of skill attributes you can test.
- You can run A/B tests on the following skill attributes: anything served by your skill's AWS Lambda function or skill code, including APL-A data.
- You can't run A/B tests on the following skill attributes: New locale launches, invocation name changes, permission changes, account linking changes, ISP product pricing changes (such as free trial length), in-skill purchase prompts, previewed content, interaction model changes, and skill manifest changes.
The following table describes some sample tests that you could run.
Test category | Example test |
---|---|
Endpoint | Branch your skill code to serve a new response, such as updated APL-A content, and measure whether it increases customer engagement. |
How A/B tests work
When a customer invokes your skill, they randomly receive one of two versions of your skill, either a control version or a treatment version.
- Control version (C) – The current experience of your live skill, before you started your test.
- Treatment version (T1) – The new experience of your skill. This is the version of the skill you're testing, which contains your updated code changes.
To ensure an accurate comparison, A/B tests are conducted blind, meaning users aren't aware whether they receive the control version or the treatment version. At the end of the test, you can choose whether to make the treatment version available to all users or revert to the control version.
Types of A/B tests you can run
- Endpoint-based test – You use a single version of a live skill to run your A/B test. You define your control and treatment experiences by adding conditional statements to the skill code of your live skill. These statements branch your skill into your C and T1 versions.
Eligibility criteria
To run an endpoint-based A/B test, your skill must meet the following eligibility criteria.
- Your skill must be live.
- Your skill must use a custom voice interaction model (custom skill).
- Your skill must have a sufficient number of monthly users.
How to split your test into C and T1
When you run an endpoint-based test, you must branch your skill code into your C and T1 versions. The following code example illustrates how to create these branches.
For step-by-step details on how to create an endpoint-based test, see Set up an Endpoint-based A/B test.
NodeJS example
// Read the active experiment, if any, from the request envelope.
// The Experimentation object is present only when the customer is part of a test.
const experimentation = handlerInput.requestEnvelope.context.Experimentation;
const test = experimentation && experimentation.activeExperiments[0];
if (test) {
    // treatmentId identifies which experience this customer receives.
    if (test.treatmentId === 'T1') {
        return handlerInput.responseBuilder.speak("treatment response")
            .getResponse();
    } else {
        return handlerInput.responseBuilder.speak("control response")
            .getResponse();
    }
} else {
    // The customer isn't part of the test; serve the default experience.
    return handlerInput.responseBuilder.speak("not exposed to treatment")
        .getResponse();
}
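The branching logic can also be factored into a small helper that safely reads the treatment ID from the request envelope. The helper below is a sketch, not part of the ASK SDK; it assumes the `Experimentation.activeExperiments` shape used in the example and returns `null` when the customer isn't part of a test:

```javascript
// Sketch of a helper (not an ASK SDK API) that returns the treatment ID
// for the first active experiment, or null when no experiment is present.
function getActiveTreatment(requestEnvelope) {
  const experiments =
    (requestEnvelope.context &&
     requestEnvelope.context.Experimentation &&
     requestEnvelope.context.Experimentation.activeExperiments) || [];
  return experiments.length > 0 ? experiments[0].treatmentId : null;
}

// Example usage with a request-envelope-shaped object:
const envelope = {
  context: { Experimentation: { activeExperiments: [{ treatmentId: 'T1' }] } }
};
const treatment = getActiveTreatment(envelope); // 'T1'
```

Centralizing the lookup in one helper keeps each intent handler's branch down to a single comparison and avoids repeating the null checks.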
Lifecycle of an A/B test
As you run your A/B test, it operates in one of the following primary states: created, enabled, running, stopped, and deleted. These states dictate which actions your test can perform at a given moment.
Your test also passes through the following secondary states as it transitions between primary states: enabling, stopping, and failed.
A/B testing state diagram
To complete an A/B test, you must move your test through the following states: create test, start test, and stop test.
The following workflow diagram illustrates this lifecycle.
Transitioning to states
You use the ASK CLI or SMAPI APIs to transition your A/B test between the following states.
- Create A/B test API – Targets the CREATED state.
- Delete A/B test API – Targets the DELETED state.
- Manage A/B test API – Targets the ENABLED, STOPPED, and RUNNING states.
For more details about using each individual API with the corresponding states, see A/B Testing SMAPI APIs.
State details
The following tables provide specific implementation details about each state.
Primary states
State | Value | Description | Next steps |
---|---|---|---|
Create test | CREATED | Creates your test with the settings you provide. | Stay in this state for as long as you want to adjust your test settings. |
Delete test | DELETED | Deletes your test. | Wait until ASK deletes your test. |
Enable test | ENABLED | Deploys your test settings, but doesn't start your test. | Stay in this state for as long as you want to QA your test. |
Start test | RUNNING | Starts your test. | Stay in this state for as long as you want to run your test. |
Stop test | STOPPED | Ends your test. | Stay in this state as long as you want to analyze your test metrics. |
Secondary states
State | Value | Description | Next steps |
---|---|---|---|
Enabling test | ENABLING | Transitory state between CREATED and ENABLED. | Wait until your test transitions to ENABLED. |
Stopping test | STOPPING | Transitory state between RUNNING and STOPPED. | Wait until your test transitions to STOPPED. |
Failed test | FAILED | Your test didn't enable or start. | Check your test configurations and try again. |
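The primary and secondary states above form a small state machine. The following sketch encodes the transitions described in this lifecycle as a transition table; it is an illustration derived from the tables above, not an official API, and the exact recovery path out of FAILED is an assumption:

```javascript
// State machine sketched from the lifecycle tables above; not an official API.
// Each state maps to the set of states it may legally transition to.
const TRANSITIONS = {
  CREATED:  ['ENABLING', 'DELETED'],
  ENABLING: ['ENABLED', 'FAILED'],
  ENABLED:  ['RUNNING', 'FAILED', 'DELETED'],
  RUNNING:  ['STOPPING'],
  STOPPING: ['STOPPED'],
  STOPPED:  ['DELETED'],
  // Assumed: a failed test can be retried (re-enabled) or deleted.
  FAILED:   ['ENABLING', 'DELETED'],
  DELETED:  [],
};

// Returns the next state, or throws if the lifecycle doesn't allow the move.
function transition(current, next) {
  if (!(TRANSITIONS[current] || []).includes(next)) {
    throw new Error(`Invalid transition: ${current} -> ${next}`);
  }
  return next;
}
```

Modeling the lifecycle this way makes it easy to see, for example, that a test must pass through STOPPING before it reaches STOPPED, and that a DELETED test is terminal.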
Related topics
Last updated: Oct 13, 2023