Statistical significance indicates whether the results of a test are "real" or simply due to chance. It gives us a measure of the reliability of the test results.
In any experiment that involves drawing a sample from a population, there is always the possibility that the observed effect (the result of the test) occurred due to chance or a sampling error. To quantify this uncertainty, and to ensure that the results of an experiment reflect the actual behavior of the overall population, we use statistical significance.
The result of a test is considered to be statistically significant if the probability that the result could have occurred by chance is lower than a pre-defined threshold.
If we denote this probability as p and the pre-defined threshold as α (alpha), then:
Statistically significant result: probability (p) < threshold (α)
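The decision rule above can be sketched in a few lines of Python. The function name and the default threshold of 0.05 are illustrative choices, not part of any specific product:

```python
def is_statistically_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Return True when the probability p of a chance result is below alpha."""
    return p_value < alpha

print(is_statistically_significant(0.03))  # True: 0.03 < 0.05
print(is_statistically_significant(0.20))  # False: 0.20 >= 0.05
```

Note that a p-value exactly equal to the threshold is conventionally not considered significant, which is why the comparison is strict.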
Statistical Significance in A/B tests
An A/B test is an example of statistical hypothesis testing, a process whereby a hypothesis is made about the relationship between two data sets and those data sets are then compared against each other to determine if there is a statistically significant relationship or not.
To put this in more practical terms, a prediction is made that content variant B will perform better than content variant A, and then data from both the content variants are observed and compared to determine if B is a statistically significant improvement over A.
For example, we have no way of knowing with 100% accuracy how the next 100,000 people who visit our channel will behave. That is information we do not have today, and if we waited until those 100,000 people had visited, it would be too late to optimize their experience. What we can do is observe the next 1,000 visitors and then use statistical analysis to predict how the following 99,000 will behave.
The complexities arise in all the ways a given "sample" can misrepresent the overall "population", and in everything we must do to ensure that our sample represents the population accurately.
Statistical Significance in A/B tests in Acoustic Personalization
In the context of Acoustic Personalization, you must specify the statistical significance for an A/B test as a percentage that indicates your confidence that the results of the A/B test are valid and free from errors caused by randomness. For example, if you set a statistical significance level of 95%, it means that you can be 95% confident that the observed results are real and not caused due to chance occurrences.
Statistical significance is based on the number of impressions and conversions for the control group and the variants.
The statistical significance value is calculated based on the click rate of visitors on the channel. To calculate it, the control group and at least one of the content variants must have a non-zero click rate.
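As a rough illustration, a confidence value can be derived from impression and conversion counts with a two-proportion z-test. This is a standard textbook approach, not necessarily the exact calculation Acoustic Personalization performs; the function name and the guard for zero click rates are illustrative:

```python
import math

def significance_pct(control_impressions, control_conversions,
                     variant_impressions, variant_conversions):
    """Confidence (%) that the variant's rate truly differs from the control's,
    computed with a two-proportion z-test. Returns None when both click rates
    are zero, since no significance can be calculated in that case."""
    p1 = control_conversions / control_impressions
    p2 = variant_conversions / variant_impressions
    if p1 == 0 and p2 == 0:
        return None  # mirrors the non-zero click-rate requirement above
    # Pooled rate under the null hypothesis that the two rates are equal.
    pooled = ((control_conversions + variant_conversions)
              / (control_impressions + variant_impressions))
    se = math.sqrt(pooled * (1 - pooled)
                   * (1 / control_impressions + 1 / variant_impressions))
    z = abs(p2 - p1) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))
    return (1 - p_value) * 100

print(significance_pct(1000, 150, 1000, 350))  # large gap: well above 99
print(significance_pct(100, 0, 100, 0))        # None: no clicks recorded yet
```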
The statistical significance value for an A/B test in Acoustic Personalization must be in the range of 50% to 100%. By default, it is set to 90%.
It is not advisable to set the value below 90%: the lower the threshold, the weaker the evidence that the improvement in conversions (or whatever the goal metric is) is due to the given variant being shown rather than to chance. Similarly, it is not advisable to set the value to 100%, because that threshold is practically impossible to meet during a test.
For example, consider an A/B test in which:
- Control group has 15% conversion rate
- Variant 1 has 30% conversion rate
- Variant 2 has 35% conversion rate
If our goal is to increase the conversion rate, then Variant 2 is the best-performing variant. Measured against the control group, the difference is 20 percentage points, which might be sufficient to meet the statistical significance threshold if we had set it to 85%. However, the same difference may not be sufficient if we had set the statistical significance to 99%.
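The effect of the chosen threshold can be sketched with a two-proportion z-test. The sample size of 40 visitors per group is a hypothetical assumption (in practice the counts come from the test's impressions), and this is not necessarily the exact formula the product uses:

```python
import math

def confidence_pct(n1, c1, n2, c2):
    """Confidence (%) from a two-proportion z-test (illustrative sketch)."""
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p2 - p1) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))
    return (1 - p_value) * 100

# Hypothetical counts: 40 visitors each; the control converts at 15% (6/40),
# Variant 2 at 35% (14/40).
conf = confidence_pct(40, 6, 40, 14)
print(conf > 85)  # True: the difference clears an 85% threshold
print(conf > 99)  # False: the same difference fails a 99% threshold
```

With larger samples the same 20-point gap would clear a higher threshold, which is why significance depends on traffic volume as well as on the rates themselves.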
Statistical significance of an in-progress A/B test
For an in-progress A/B test, a message appears on the Performance details page if all of the following conditions are met:
- The test is in progress.
- The statistical significance reached by the test is less than 90%.
- The test has run for less than a week.
The statistical significance shown for an A/B test that is still in progress may not reflect the final outcome and is therefore not very reliable. For a more reliable result, it is recommended to wait about a week from the A/B test start date.
Choosing the winning content
The winning content is decided only when the A/B test reaches its end date.