What is the Confidence Meter?

Marpipe is the only automated ad testing platform with a live statistical significance calculator built right in. We call it the Confidence Meter, and it tells you whether the testing data for each variant group is statistically significant or not.

It appears in the Intelligence section.

Each multivariate test run on Marpipe has multiple variant groups being tested, and some reach high confidence sooner than others. When a variant group reaches high confidence, it means you have enough data to make creative decisions. And when enough variant groups reach high confidence, you can move on to your next test.

The Confidence Meter can tell you several things, like:

  • whether or not a variant group has reached high confidence

  • if further testing for a certain variant group is necessary

  • whether repeating the test would result in a similar distribution of data

  • when you have enough information to move on to your next test

  • how to move forward with each specific group of creative elements

The Confidence Meter will NOT tell you that:

  • one variant or variant group is the all-time best (or worst)

  • a variant group will always (or never) impact your KPIs

  • there’s no need to challenge winners in subsequent tests

How to read the Confidence Meter

Gray bar means:

  • 0–55% confidence; fluctuations in performance are likely due to chance

  • further testing for this variant group is necessary

  • you do not have enough information to move on to your next test

  • try testing variants with more substantial differences between them

Yellow bar means:

  • 56–79% confidence; fluctuations in performance might be due to chance

  • further testing for this variant group is necessary

  • you do not have enough information to move on to your next test

  • try looking at another KPI or continue to put spend behind this test to reach high confidence

Green bar means:

  • 80–100% confidence; fluctuations in performance are unlikely to be due to chance

  • further testing for this variant group is not necessary

  • if enough variant groups are green, you have enough information to move on to your next test

  • continue to challenge winning elements and drop low performers in future tests
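
The three color bands above can be sketched as a small lookup. This is only an illustration of the thresholds listed in this article, not Marpipe's actual code: the function name `confidence_bar` is hypothetical, and treating fractional values between bands (for example 55.5%) as belonging to the lower band is an assumption.

```python
def confidence_bar(confidence_pct: float) -> str:
    """Return the bar color for a confidence percentage (0-100).

    Bands follow the article: gray 0-55, yellow 56-79, green 80-100.
    Values that fall between two bands are assigned to the lower band
    (an assumption; the article only lists whole percentages).
    """
    if not 0 <= confidence_pct <= 100:
        raise ValueError("confidence_pct must be between 0 and 100")
    if confidence_pct >= 80:
        return "green"   # enough data to act on this variant group
    if confidence_pct >= 56:
        return "yellow"  # keep spending or look at another KPI
    return "gray"        # differences are likely due to chance


if __name__ == "__main__":
    for pct in (12, 55, 56, 79, 80, 100):
        print(pct, confidence_bar(pct))
```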

Our Methodology

What underlying statistical methods do we use?

Marpipe uses the G-test (also known as the likelihood ratio test). It's a statistical test that determines whether the proportions of categories differ significantly between two or more groups. It has been a standard test for significance in science and mathematics for decades.
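
To make the G-test concrete, here is a minimal sketch for the simplest case: two variant groups, each with a count of conversions and non-conversions. It is a textbook G-test of independence, not Marpipe's implementation. The function name `g_test_2x2`, its parameters, and the choice of conversions as the KPI are all assumptions made for illustration. How the Confidence Meter turns a test result into a percentage is not described in this article, so the sketch stops at the p-value.

```python
import math


def g_test_2x2(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """G-test of independence on a 2x2 table.

    Rows are the two groups; columns are (converted, did not convert).
    Returns (G statistic, p-value). Hypothetical helper for illustration.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each group needs at least one observation")
    observed = [
        [conv_a, n_a - conv_a],
        [conv_b, n_b - conv_b],
    ]
    total = n_a + n_b
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]

    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # an empty cell contributes nothing to the sum
                e = row_totals[i] * col_totals[j] / total
                g += 2.0 * o * math.log(o / e)
    g = max(g, 0.0)  # guard against tiny negative values from rounding

    # With 1 degree of freedom, the chi-square tail probability
    # P(X > g) equals erfc(sqrt(g / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


if __name__ == "__main__":
    g, p = g_test_2x2(conv_a=120, n_a=1000, conv_b=80, n_b=1000)
    print(f"G = {g:.3f}, p = {p:.4f}")
```

A small p-value means the difference in conversion rate between the two groups is unlikely to be due to chance alone. Tables with more than two groups or categories use the same sum with more degrees of freedom, which needs a general chi-square distribution function rather than the one-line shortcut used here.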

Why don't we do multiple analysis corrections?

Marpipe lets you break down and analyze your results in a nearly infinite number of ways — or just one.

If you accept the results of multiple analysis breakdowns without correction, from a statistics point of view you are more likely to think that there is a meaningful result when, in fact, there is none.

Because of this, we highly suggest customers decide on a primary hypothesis prior to running a test. And when looking at results across tests, we also suggest creating a new test to validate any specific patterns that seem to emerge.
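
The risk of reading many breakdowns at once can be made concrete with a little arithmetic. The sketch below assumes that every breakdown is independent and that none of them has a real effect, and it uses the 80% green-bar threshold as the cutoff. Under those assumptions, it estimates the chance that at least one breakdown looks significant purely by accident. The function name `chance_of_false_positive` is hypothetical, and real breakdowns usually overlap, so treat the numbers as a rough guide.

```python
def chance_of_false_positive(num_breakdowns: int,
                             confidence_threshold: float = 0.80) -> float:
    """Chance that at least one of several breakdowns crosses the
    confidence threshold by luck alone.

    Assumes independent breakdowns with no real effect in any of them.
    """
    if num_breakdowns < 1:
        raise ValueError("num_breakdowns must be at least 1")
    if not 0.0 < confidence_threshold < 1.0:
        raise ValueError("confidence_threshold must be between 0 and 1")
    alpha = 1.0 - confidence_threshold  # per-breakdown false positive rate
    return 1.0 - (1.0 - alpha) ** num_breakdowns


if __name__ == "__main__":
    for m in (1, 3, 5, 10):
        print(m, round(chance_of_false_positive(m), 3))
```

With one pre-chosen hypothesis the chance of a lucky green bar is 20%. With ten breakdowns it climbs to roughly 89%, which is why deciding on a primary hypothesis up front and validating surprises in a new test matters.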
