The two purposes of `babbab` are:
- To be the simplest tool for Data Analysts/Statisticians to analyze A/B tests.
- To return the simplest results for Stakeholders/Non-Statisticians to understand.
`babbab` is an acronym of BAyesian Beta-Binomial A/B testing (BaBBAB), but it's spelled in lowercase (`babbab`) because it doesn't like shouting.
This should work in vanilla Python 3.8+.

```shell
pip install babbab
```
Let's assume we are testing changing the background color of our app from grey to green. Say we sell subscriptions to a paper magazine, and we want to know if changing the background color will increase sales. To do so, we assign 50% of our users to the new app design with a green background (the Variant group), while the other 50% stay on the old grey design (the Control group). We managed to pull these 4 numbers out of our tracking into Python:
```python
control_sold_subscriptions = 200
control_users = 40316
variant_sold_subscriptions = 250
variant_users = 40567
```
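As a quick sanity check, the raw conversion rates are easy to compute by hand. This is just arithmetic on the numbers above, not a `babbab` call:

```python
# Observed conversion rates, straight from the tracking numbers
control_sold_subscriptions = 200
control_users = 40316
variant_sold_subscriptions = 250
variant_users = 40567

control_rate = control_sold_subscriptions / control_users
variant_rate = variant_sold_subscriptions / variant_users
print(f"control: {control_rate:.4%}, variant: {variant_rate:.4%}")
```

The variant converts a bit better in the raw data, but raw rates alone don't tell you whether the difference is real or noise; that is what the analysis below is for.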
Because `babbab` is awesome, you can just run:

```python
import babbab as bab

plot, statement, trace = bab.quick_analysis(control_sold_subscriptions,
                                            control_users,
                                            variant_sold_subscriptions,
                                            variant_users)
```
And get everything you need.
- In `plot` you will find a matplotlib figure. You can change the title and labels via the `quick_analysis` function.
- In `statement` you will get a string that is intended to be interpreted verbatim by Non-Statisticians.
- In `trace` you will get an arviz InferenceData object, in case you want to explore the run further.
In the signature of `quick_analysis` you can configure the statistics and the aesthetics of most of this.
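Conceptually, a Beta-Binomial analysis like this puts a Beta posterior over each group's conversion rate and then asks how often the variant's rate beats the control's. Here is a minimal Monte Carlo sketch of that idea using only NumPy, assuming uniform Beta(1, 1) priors (`babbab`'s actual model runs through PyMC and its priors may differ):

```python
import numpy as np

rng = np.random.default_rng(42)

# Same numbers as the example above
control_sold, control_users = 200, 40316
variant_sold, variant_users = 250, 40567

# With a Beta(1, 1) prior, the posterior for a conversion rate is
# Beta(1 + successes, 1 + failures). Draw samples from each posterior.
n_samples = 100_000
control_rate = rng.beta(1 + control_sold, 1 + control_users - control_sold, n_samples)
variant_rate = rng.beta(1 + variant_sold, 1 + variant_users - variant_sold, n_samples)

# Fraction of posterior samples where the variant converts better
prob_variant_better = (variant_rate > control_rate).mean()
print(f"P(variant > control) = {prob_variant_better:.3f}")
```

A probability like this ("the variant is better with probability X") is the kind of directly interpretable result `babbab` aims to put in front of stakeholders, instead of a p-value.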
A/B tests (or controlled experiments) are an increasingly popular way of incrementally improving websites, desktop apps, and mobile apps. At Multilayer we have analyzed probably hundreds, with a myriad of different tools and statistical methodologies.
In our experience, when companies run A/B tests, the biggest problems they encounter are around interpreting the results and acting appropriately on them. There are plenty of statistical libraries out there that do A/B testing right (`babbab` actually uses PyMC in the background). However, sharing statistics (like p-values) with non-statisticians can lead to confusion and misuse of results.
What `babbab` tries to cover is the "last mile" of A/B test analysis: interpreting and communicating the results so that they are actionable.
- Get 4 numbers in, and get a statistically valid statement that you can repeat to your manager verbatim, plus a plot you can understand.
- Get 4 numbers in plus some labels, and you will get the above, a plot you can share, and a statement you can copy-paste into the company chat.
- Add a bit more work, and you have your own custom-built A/B testing dashboard/tool.
Stop worrying about your peers and yourself misinterpreting stats.
Still a lot of basic docs to do:
- Add example results (plot, statement) to the README
- Add example with labels to README
- Add docstrings
Maybe?
- Sphinx or RTD Documentation