A/B Testing Metric and Configuration Attributes


Configuration attributes and metrics are the building blocks of an A/B test. They help you define the purpose of your test and track the outcome of your experiments. Using the correct configuration attributes and metrics is critical to running a useful test.

Configuration attributes

Configuration Attributes Definition

Control

The group of customers in your A/B test that continue to receive your current skill experience. You use the metrics captured from this group to establish a baseline for your test.

Guardrail Metrics

Metrics that you set up to track and detect unexpected regressions caused by your new treatment experience. For more details about the metrics you can use as guardrails, see Metrics. To track your guardrail metrics, view the metrics on the Analytics tab. The A/B test doesn't send alerts based on changes in guardrail metrics.

Hypothesis

An assumption you make before your A/B test starts, with the goal of predicting or defining the outcome of your test. Use this field only to document the purpose of your A/B test. It doesn't affect the test outcome.

Key Metrics

Metrics that you set up to track and detect expected changes caused by your new treatment experience. These metrics should help you determine if your A/B test benefits your skill. For more details about the metrics you can use as key metrics, see Metrics.

Traffic Exposure

The percentage of customers that have enabled your skill and can participate in your A/B test. For example, if you have 100 total customers and you set your Traffic Exposure to 40 percent, you're including 40 customers in your test. In this case, your test includes 20 customers in your C (control) group and 20 customers in your T1 (treatment) group. The remaining 60 customers aren't included in the test and receive the default behavior, which is equivalent to C, but they don't contribute to the test metrics.

Treatment

The group of customers that receive your new skill experience when your test is running.

P-Value

The probability of seeing a result at least as extreme as the one observed (that is, a difference at least as far from zero), assuming that the null hypothesis is true.

User Count

The number of users included in the skill version you're testing.

Percent Diff

The relative percent difference between the mean of the T1 group and the mean of the C group.

Confidence Interval

A range of values that presents the uncertainty associated with a given measurement of a parameter of interest.
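
The arithmetic behind Traffic Exposure, Percent Diff, P-Value, and Confidence Interval can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the even split between C and T1 follows the Traffic Exposure example above, and the two-sided normal (z) approximation is an assumption made here, because this page doesn't state which statistical test the A/B testing service uses.

```python
from math import sqrt
from statistics import NormalDist


def split_traffic(total_customers, exposure_percent):
    """Return (control, treatment, excluded) counts.

    Assumes the exposed customers are split evenly between C and T1,
    as in the Traffic Exposure example (hypothetical helper).
    """
    in_test = total_customers * exposure_percent // 100
    control = in_test // 2
    treatment = in_test - control
    return control, treatment, total_customers - in_test


def percent_diff(mean_c, mean_t1):
    """Relative percent difference of the T1 mean against the C mean."""
    return (mean_t1 - mean_c) / mean_c * 100


def z_test(mean_c, se_c, mean_t1, se_t1, confidence=0.95):
    """Two-sided p-value and confidence interval for (T1 mean - C mean).

    Assumption: a normal approximation with known standard errors.
    The real service might use a different test.
    """
    diff = mean_t1 - mean_c
    se = sqrt(se_c ** 2 + se_t1 ** 2)
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    margin = NormalDist().inv_cdf(0.5 + confidence / 2) * se
    return p_value, (diff - margin, diff + margin)


# 100 customers at 40 percent exposure: 20 in C, 20 in T1, 60 excluded.
print(split_traffic(100, 40))

# Hypothetical metric means and standard errors for C and T1.
print(percent_diff(0.10, 0.12))
print(z_test(0.10, 0.01, 0.12, 0.01))
```

In this sketch, a confidence interval that contains zero goes together with a p-value above the matching significance level (0.05 for a 95 percent interval), so the two values tell a consistent story about whether the observed difference could be noise.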

Metrics

You can designate any of the following metrics as either a key metric or a guardrail metric.

You should select one to three key metrics that track changes in your customer behavior as they relate to your hypothesis. For example, if your hypothesis states that you might increase customer subscriptions by changing the location of your ISP upsell messaging, then you might select the following as key metrics: ISP : Offer Accept Rate and ISP : OPS.

For an example of how to use these metrics in a test, see Set up an Endpoint-based A/B test.

Metric Name Description

Customer Friction

A metric calculated from various customer interaction patterns and other contextual signals to predict whether a customer perceived friction.

ISP : OPS

The amount of revenue generated from ISP sales.

ISP : Sales

The number of sales (quantity) generated from ISP offers.

ISP : Offer Accept Rate

The total number of offers accepted divided by the total number of offers delivered. Note that accepted offers are counted prior to the payment being complete and successful.

ISP : Offer to Purchase Conversion

The total number of ISP purchases completed divided by the total number of offers delivered.

Skill Next Day Retention

How often a customer uses your skill in a day or a set of consecutive days.

Skill Utterances

Tracks customer sessions with skills. This metric maps to one dialog per session with a skill, regardless of the number of interactions (multi turns) within a session.

Skill Active Days

The number of days on which a customer actively uses your skill. This number is calculated from your skill's Dialog data. An active day is counted if a customer has at least one dialog on that day.
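
As a rough illustration of how the ratio metrics and Skill Active Days are defined, the following Python sketch computes them from raw counts and dialog dates. The function names and sample numbers are hypothetical, and returning 0.0 when no offers were delivered is an assumption made here. The sketch doesn't show how the Analytics tab computes these values internally.

```python
from datetime import date


def offer_accept_rate(offers_accepted, offers_delivered):
    """Offers accepted divided by offers delivered.

    Accepted offers are counted before payment completes.
    Returns 0.0 when no offers were delivered (assumption).
    """
    if offers_delivered == 0:
        return 0.0
    return offers_accepted / offers_delivered


def offer_to_purchase_conversion(purchases_completed, offers_delivered):
    """Completed ISP purchases divided by offers delivered."""
    if offers_delivered == 0:
        return 0.0
    return purchases_completed / offers_delivered


def active_days(dialog_dates):
    """Count distinct days on which a customer had at least one dialog."""
    return len(set(dialog_dates))


# Hypothetical sample data: 200 offers, 50 accepted, 40 purchases completed.
print(offer_accept_rate(50, 200))             # 0.25
print(offer_to_purchase_conversion(40, 200))  # 0.2

# Two dialogs on Oct 1 and one on Oct 3 count as two active days.
dialogs = [date(2023, 10, 1), date(2023, 10, 1), date(2023, 10, 3)]
print(active_days(dialogs))                   # 2
```

Because accepted offers are counted before payment completes, the Offer Accept Rate is normally at least as high as the Offer to Purchase Conversion for the same set of delivered offers.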



Last updated: Oct 13, 2023