Build With Me: The A/B Testing Engine
Proving ROI with Stats
The marketing team changed the color of the 'Buy Now' button from Blue (Variant A) to Green (Variant B). Variant B got 5 more clicks today, and the designer wants to deploy it immediately. As the Data Scientist, you need to determine whether this is a real improvement or just random noise. Let's code the math.
Step 1: Define the Experiment
First, we input the raw data from our web analytics platform.
# Experiment Setup
TRAFFIC_PER_VARIANT = 1000
CONVERSIONS_A = 120 # Blue Button
CONVERSIONS_B = 125 # Green Button
# Calculate basic conversion rates
cr_a = CONVERSIONS_A / TRAFFIC_PER_VARIANT
cr_b = CONVERSIONS_B / TRAFFIC_PER_VARIANT
print(f"Blue CR: {cr_a:.2%} | Green CR: {cr_b:.2%}")
Your Turn: Hit Run Experiment. The Visualizer will generate two bell curves. Look at the Verdict Box. Is a 5-conversion difference statistically significant at 1000 visitors? No. At this sample size, a gap that small is indistinguishable from random noise.
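To put rough numbers on the two bell curves, here is a minimal sketch that computes an approximate 95% interval around each conversion rate using the normal approximation. The helper name rate_interval and the 1.96 multiplier are assumptions for illustration, not something the Visualizer is known to use, and overlapping intervals are a quick visual check rather than a formal test.

```python
import math

TRAFFIC_PER_VARIANT = 1000
CONVERSIONS_A = 120  # Blue button
CONVERSIONS_B = 125  # Green button

def rate_interval(conversions, visitors, z=1.96):
    """Return (low, high) for an approximate 95% interval on a conversion rate."""
    rate = conversions / visitors
    # Standard error: the spread of the bell curve around the observed rate
    std_err = math.sqrt(rate * (1 - rate) / visitors)
    return rate - z * std_err, rate + z * std_err

low_a, high_a = rate_interval(CONVERSIONS_A, TRAFFIC_PER_VARIANT)
low_b, high_b = rate_interval(CONVERSIONS_B, TRAFFIC_PER_VARIANT)
print(f"Blue:  {low_a:.2%} to {high_a:.2%}")
print(f"Green: {low_b:.2%} to {high_b:.2%}")

# Heavily overlapping intervals hint that the gap could be noise.
intervals_overlap = low_b <= high_a and low_a <= high_b
print("Intervals overlap:", intervals_overlap)
```

With these inputs, the Blue interval runs from roughly 9.99% to 14.01% and the Green interval from roughly 10.45% to 14.55%, so the two ranges cover almost the same ground.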
Step 2: The Law of Large Numbers
What if the exact same conversion rates happened, but we had way more traffic? Change your code so that TRAFFIC_PER_VARIANT = 100000, CONVERSIONS_A = 12000, and CONVERSIONS_B = 12500. (Ten times the traffic is not enough here: at 10,000 visitors per variant, a two-proportion z-test still gives z ≈ 1.08 and p ≈ 0.28.)
Your Turn: Run the experiment again. Look at the visualizer. Notice how the bell curves got much narrower and pulled apart? By increasing our sample size, we shrank the uncertainty around each conversion rate. With z ≈ 3.41 and p < 0.001, the verdict is now Statistically Significant!
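A quick way to see how sample size drives the verdict is to hold the conversion rates fixed and vary only the traffic. The sketch below assumes a pooled two-proportion z-test at the 5% level, which may not be exactly what the Visualizer computes. RATE_A, RATE_B, and two_sided_p_value are illustrative names.

```python
import math

RATE_A, RATE_B = 0.12, 0.125  # same conversion rates as in Step 1

def two_sided_p_value(visitors_per_variant):
    """Pooled two-proportion z-test for two equal-sized variants."""
    pooled = (RATE_A + RATE_B) / 2
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / visitors_per_variant)
    z_score = (RATE_B - RATE_A) / std_err
    # erfc(|z| / sqrt(2)) equals the two-sided normal tail probability
    return z_score, math.erfc(abs(z_score) / math.sqrt(2))

results = {}
for visitors in (1_000, 10_000, 100_000):
    z_score, p_value = two_sided_p_value(visitors)
    results[visitors] = (z_score, p_value)
    print(f"{visitors:>7,} visitors: z = {z_score:.2f}, p = {p_value:.4f}")
```

Because the standard error shrinks with the square root of the sample size, ten times the traffic multiplies the z-score by only about 3.16. That is why a half-point lift needs far more than a few thousand visitors before it clears the 0.05 threshold.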
Step 3: Calculating P-Value in Python
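Under the hood, we use scipy to turn the gap between the two curves into a z-score and then a P-Value: the probability of seeing a difference at least this large if the two buttons actually converted at the same rate.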
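import scipy.stats as stats
# Two-proportion z-test, using the variables defined in Step 1.
# In a real project, statsmodels' proportions_ztest performs the same calculation.
p_pool = (CONVERSIONS_A + CONVERSIONS_B) / (2 * TRAFFIC_PER_VARIANT)
std_err = (p_pool * (1 - p_pool) * 2 / TRAFFIC_PER_VARIANT) ** 0.5
z_score = (cr_b - cr_a) / std_err
# A p-value under 0.05 means a gap this large would appear less than 5% of the time if the buttons performed the same.
p_value = 2 * stats.norm.sf(abs(z_score))
print(f"z = {z_score:.2f} | p-value = {p_value:.4f}")
print("Never trust raw numbers without calculating significance!")
Visualizer legend: Variant A (Control) | Variant B (Treatment)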
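As a cross-check on the manual calculation, the same test can be run through statsmodels' proportions_ztest. This is a hedged sketch: it assumes statsmodels is installed, which the lesson does not state, so it falls back to a hypothetical helper, manual_ztest, that mirrors the pooled formula using only the standard library.

```python
import math

def manual_ztest(counts, nobs):
    """Pooled two-proportion z-test using only the standard library."""
    p1, p2 = counts[0] / nobs[0], counts[1] / nobs[1]
    pooled = sum(counts) / sum(nobs)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / nobs[0] + 1 / nobs[1]))
    z = (p1 - p2) / std_err
    return z, math.erfc(abs(z) / math.sqrt(2))

try:
    # Assumption: statsmodels is available in your environment.
    from statsmodels.stats.proportion import proportions_ztest
except ImportError:
    proportions_ztest = manual_ztest  # fallback so the sketch still runs

counts = [125, 120]      # Green (B) and Blue (A) conversions from Step 1
visitors = [1000, 1000]  # traffic per variant
z_stat, p_value = proportions_ztest(counts, visitors)
verdict = "significant" if p_value < 0.05 else "not significant"
print(f"z = {z_stat:.2f}, p = {p_value:.3f} -> {verdict} at the 5% level")
```

Either path returns a z-statistic and a two-sided p-value, so the Step 1 data should come out as not significant regardless of which implementation runs.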