A/B Testing Metric and Configuration Attributes

Configuration attributes and metrics are the building blocks of an A/B test. They help you define the purpose of your test and track the outcome of your experiments. Using the correct configuration attributes and metrics is critical to running a useful test.

Configuration attributes

Configuration Attributes	Definition
Control	The group of customers in your A/B test that continue to receive your current skill experience. You use the metrics captured from this group to establish a baseline for your test.
Guardrail Metrics	Metrics that you set up to track and detect unexpected regressions caused by your new treatment experience. For more details about the metrics you can use as guardrails, see Metrics. To track your guardrail metrics, view the metrics on the Analytics tab. The A/B test doesn't send alerts based on changes in guardrail metrics.
Hypothesis	An assumption you make before your A/B test starts, with the goal of predicting or defining the outcome of your test. Use this field only to document the purpose of your A/B test. It doesn't affect the test outcome.
Key Metrics	Metrics that you set up to track and detect expected changes caused by your new treatment experience. These metrics should help you determine if your A/B test benefits your skill. For more details about the metrics you can use as key metrics, see Metrics.
Traffic Exposure	The share of customers who have enabled your skill and can participate in your A/B test, set as a percentage. For example, if you have 100 total customers and you set your Traffic Exposure to 40 percent, you're including 40 customers in your test. In this case, your test includes 20 customers in your control (C) group and 20 customers in your treatment (T1) group. The remaining 60 customers aren't included in the test and receive the default behavior, which is equivalent to C. However, they don't contribute to the test metrics.
Treatment	The group of customers that receive your new skill experience when your test is running.
P-Value	The probability of observing a result at least as extreme as the one measured, assuming that the null hypothesis (no difference between the T1 group and the C group) is true.
User Count	The number of users included in the skill version you're testing.
Percent Diff	The relative percent difference between the mean of the T1 group and the mean of the C group.
Confidence Interval	A range of values that expresses the uncertainty associated with a given measurement of a parameter of interest.
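
The Traffic Exposure and Percent Diff definitions above can be illustrated with a short Python sketch. It assumes exposed customers are split evenly between C and T1, as in the 100-customer example, and that Percent Diff is measured relative to the C mean. Both are assumptions made for illustration, and the function names split_traffic and percent_diff are hypothetical rather than part of any A/B testing API.

```python
def split_traffic(total_customers: int, exposure_percent: float) -> dict:
    """Split customers into C, T1, and excluded groups.

    Assumption: exposed customers are divided evenly between C and T1.
    """
    exposed = round(total_customers * exposure_percent / 100)
    control = exposed // 2
    treatment = exposed - control
    return {"C": control, "T1": treatment, "excluded": total_customers - exposed}


def percent_diff(control_mean: float, treatment_mean: float) -> float:
    """Relative percent difference of the T1 mean against the C mean.

    Assumption: the C mean is the baseline (denominator).
    """
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return (treatment_mean - control_mean) / control_mean * 100


# The example from the Traffic Exposure row: 100 customers, 40 percent exposure.
print(split_traffic(100, 40))   # {'C': 20, 'T1': 20, 'excluded': 60}

# Hypothetical metric means: 2.0 for C and 2.5 for T1.
print(percent_diff(2.0, 2.5))   # 25.0
```

The excluded customers appear in the result only to show that they are left out of the test metrics.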

Metrics

You can designate any of the following metrics as either a key metric or a guardrail metric.

You should select one to three key metrics that track changes in customer behavior as they relate to your hypothesis. For example, if your hypothesis states that you might increase customer subscriptions by changing the location of your ISP upsell messaging, then you might select the following as key metrics: ISP : Offer Accept Rate and ISP : OPS.

For an example of how to use these metrics in a test, see Set up an Endpoint-based A/B test.

Metric Name	Description
Customer Friction	A metric calculated from various customer interaction patterns and other contextual signals to predict whether a customer perceived friction.
ISP : OPS	The amount of revenue generated from ISP sales.
ISP : Sales	The number of sales (quantity) generated from ISP offers.
ISP : Offer Accept Rate	The total number of offers accepted divided by the total number of offers delivered. Accepted offers are counted before the payment completes successfully.
ISP : Offer to Purchase Conversion	The total number of ISP purchases completed divided by the total number of offers delivered.
Skill Next Day Retention	How often a customer uses your skill in a day or a set of consecutive days.
Skill Utterances	Tracks customer sessions with skills. This metric maps to one dialog per session with a skill, regardless of the number of interactions (multi turns) within a session.
Skill Active Days	The number of days on which a customer actively uses your skill. This number is calculated from your skill's Dialog data. A day counts as active if the customer has at least one dialog on that day.
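
The ratio metrics and Skill Active Days in the table above can be sketched in Python as follows. This is an illustration of the definitions only, not how the Analytics tab computes them. The function names, the input shapes, and the sample numbers are all hypothetical.

```python
from datetime import datetime
from typing import Iterable


def offer_accept_rate(offers_accepted: int, offers_delivered: int) -> float:
    """Offers accepted divided by offers delivered."""
    if offers_delivered == 0:
        raise ValueError("no offers delivered")
    return offers_accepted / offers_delivered


def offer_to_purchase_conversion(purchases_completed: int, offers_delivered: int) -> float:
    """Completed ISP purchases divided by offers delivered."""
    if offers_delivered == 0:
        raise ValueError("no offers delivered")
    return purchases_completed / offers_delivered


def skill_active_days(dialog_times: Iterable[datetime]) -> int:
    """Count distinct calendar days that have at least one dialog."""
    return len({t.date() for t in dialog_times})


# Hypothetical sample: 200 offers delivered, 50 accepted, 40 purchases completed.
print(offer_accept_rate(50, 200))             # 0.25
print(offer_to_purchase_conversion(40, 200))  # 0.2

# Hypothetical dialog timestamps for one customer: two dialogs on Oct 1, one on Oct 3.
dialogs = [
    datetime(2023, 10, 1, 9, 0),
    datetime(2023, 10, 1, 18, 30),
    datetime(2023, 10, 3, 7, 15),
]
print(skill_active_days(dialogs))             # 2
```

Because accepted offers are counted before payment completes, the accept rate in this sample is higher than the offer to purchase conversion.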

Last updated: Oct 13, 2023
