The recently published “Final Benchmark Study” by Optimizely, based on the analysis of over 127,000 experiments, provides invaluable insights to inform your A/B testing and experimentation program.
These are some interesting insights from the study:
💡 88% of tests don’t win. (This is why it’s SO important to test – our intuitions about what will succeed are often wrong.)
💡 Only a third of experiments test more than one variation, but experiments that have more variations are 3x as impactful (i.e., we should do more ABCD tests when possible)
💡 Tests that make significant changes to the user experience (pricing, discounts, checkout flow, data collection, etc.) are more likely to win and tend to deliver higher uplifts.
💡 Experiments that include targeting are 16% more likely to win when compared to untargeted experiments.
💡 The median company runs 34 experiments per year. The top 3% of companies run over 500. To be in the top 10%, you need to be running 200 experiments annually.
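A test “wins” when its primary metric shows a statistically significant lift over control. As a minimal, illustrative sketch – not Optimizely’s actual methodology, which uses more sophisticated sequential statistics – a classic two-proportion z-test on conversion counts looks like this:

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    Illustrative only: real experimentation platforms typically use
    sequential or Bayesian methods rather than a fixed-horizon z-test.
    Returns (relative lift, z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled proportion under the null hypothesis of no difference
    p = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via erf)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    lift = (p_b - p_a) / p_a
    return lift, z, p_value

# Hypothetical numbers: control converts 500/10,000, variant 560/10,000
lift, z, p = two_proportion_z_test(500, 10_000, 560, 10_000)
print(f"lift={lift:.1%} z={z:.2f} p={p:.3f}")
```

Note that even a 12% relative lift on 10,000 visitors per arm can fall just short of the conventional p < 0.05 bar – one reason so many tests end up in the 88% that “don’t win.”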
Other key findings
Experimentation Win Rates and Company Practices
- About 12% of experiments win on their primary metric, while 88% do not.
- The median company runs 34 experiments per year, with the top 3% conducting over 500 annually.
- Companies are increasing their experimentation velocity by 20% year over year.
- Most experiment uplifts decrease to 80% of their initial value after a year, except for revenue-related uplifts, which retain 91%.
Experimentation Evolution and Strategies
- Companies are transitioning from client-side testing to more mature experimentation frameworks, with feature experimentation growing to 36% of all tests since 2016.
- Experiments involving more complex changes and multiple variations are more successful.
- Advanced analytics and integrated Customer Data Platforms (CDPs) significantly enhance experimentation success.
Industry and Metric Variations
- Win rates and experiment success vary across industries, influenced by experimentation maturity and metric selection.
- The choice of primary metrics for experiments differs by industry, reflecting varying goals and priorities.
Team Performance and Experiment Design
- Experimentation teams tend to maintain consistent performance over three years; improvement requires changing research, creativity, and development processes.
- High-impact experiments often involve substantial changes and multiple variations.
- Greater complexity in experiments, such as multiple change types, leads to higher returns.
Micro-Conversion and Personalization
- Focusing on micro-conversions (like search rate and add-to-cart rate) can lead to higher experiment impact than targeting revenue alone.
- Personalized experiments targeting specific user segments are 41% more impactful than general ones.
Resource Allocation and Traffic Models
- Effective resource allocation, including developer time, is crucial. The most productive setup is running one experiment per developer per two-week sprint.
- Machine-learning models like Stats Accelerator and Multi-Armed Bandit, which dynamically allocate traffic, significantly enhance experiment outcomes compared to standard A/B tests.
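The idea behind dynamic traffic allocation can be sketched with Thompson sampling on a Bernoulli bandit. This is a generic illustration of the technique, not Optimizely’s actual Stats Accelerator implementation; the conversion rates below are hypothetical:

```python
import random

def thompson_sampling(true_rates, rounds=10_000, seed=42):
    """Bernoulli Thompson sampling: shift traffic toward the
    variation most likely to have the best conversion rate.

    true_rates: hypothetical per-variation conversion probabilities.
    Returns how many visitors each variation received.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [1] * k  # Beta(1, 1) uniform prior per arm
    failures = [1] * k
    pulls = [0] * k
    for _ in range(rounds):
        # Draw a plausible rate for each arm from its posterior,
        # then send this visitor to the arm with the highest draw
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls

# Most traffic should drift toward the best-converting variation
print(thompson_sampling([0.04, 0.05, 0.07]))
```

Unlike a fixed 50/50 split, the allocation adapts as evidence accumulates, which is why such models can reach useful conclusions with less wasted traffic.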
Successful experimentation in digital commerce hinges on advanced analytics, complex experiment designs, a focus on micro-conversions, personalization, and efficient resource allocation. These insights can guide executives in fostering a culture of innovation and optimizing their digital strategies effectively.
Check out the report
Dive in to start reading The Evolution of Experimentation research from Optimizely.