The Concept of EVI: Expected Value of Information

If you have a lot of uncertainty now, you don’t need much data to reduce uncertainty significantly…if you know almost nothing, almost anything will tell you something.
 -- Douglas Hubbard in How to Measure Anything (p. 162)


There is a tremendously useful concept, Expected Value of Information (EVI), that I’d like to summarize. 

Assume that you’re faced with a set of choices. In product development, we typically have a long backlog of ideas, and we need to choose a few projects to implement in our next development cycle. The prioritization may be done by the program manager or the group manager, or by some crowdsourcing of votes by several trusted experts.

Given the poor success rate of ideas that we implement, where only about 10-35% of controlled experiments improve the metrics they were designed to improve (Online Controlled Experiments at Large Scale, Tenet 3), the question is: what can we do to reduce development costs?

To be clear up front, I’m not going to claim I have a magic solution for improving that 10-35% rate---I don’t. Significant changes to that percentage are unlikely. What I am going to show is something simple: run smaller initial discovery projects that cost less; they’ll let you quickly home in on the more promising projects. If you fail quickly on smaller projects, you’ll have more resources to invest in the ones showing success.

Let’s assume that we know what we’re optimizing for, and we have a clear Overall Evaluation Criterion (OEC). If there is no agreed-upon OEC, then we should step back and define one (or a few key metrics); otherwise, anyone can simply declare success now. Note that the OEC can be quantitative data (e.g., user actions like purchases and time to those actions), survey data (are users satisfied?), or expert opinion (e.g., a “coherency” score for the design of the user flow).
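To make this concrete, here is a minimal, hypothetical sketch of an OEC as a single weighted score. The components and weights below are invented for illustration; a real OEC would be defined and agreed upon by the team up front:

```python
# Hypothetical OEC: a weighted combination of normalized component metrics.
# All components and weights are invented for illustration; a real OEC is
# defined and agreed upon before any projects are prioritized.
def oec(purchase_rate: float, time_to_purchase_norm: float,
        satisfaction: float, coherency: float) -> float:
    return (0.5 * purchase_rate                   # user actions (quantitative)
            + 0.2 * (1 - time_to_purchase_norm)   # normalized time; faster is better
            + 0.2 * satisfaction                  # survey data, scaled to [0, 1]
            + 0.1 * coherency)                    # expert "coherency" score in [0, 1]
```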

Now further assume that we have a mean and a 95% confidence interval for the OEC of each project, say by crowdsourcing votes from our trusted experts and merging them. Every project has uncertainty about its outcome. Some projects have a narrow confidence interval because we know they previously worked (e.g., most ideas that worked in the US have a high probability of working in the UK, so our confidence interval may be narrow). Others have more uncertainty, represented by a wide confidence interval where the delta OEC (treatment effect) crosses zero, so we do not know whether it will be positive or negative.

The usual (simplified) process is to prioritize the projects by the mean of the confidence interval and implement those at the top, subject to our available resources.
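As a hypothetical sketch of these two steps, the snippet below merges invented expert estimates into a per-project mean and t-based 95% confidence interval, then ranks the projects by the mean (real merging schemes can be more sophisticated, e.g., weighting experts by track record):

```python
import numpy as np
from scipy import stats

# Invented expert estimates of the delta OEC (%) for three backlog projects.
votes = {
    "UK rollout of US feature": [1.8, 2.2, 2.0, 1.9],   # worked in the US: narrow CI
    "New onboarding flow":      [-1.0, 3.0, 0.5, 2.5],  # high uncertainty: wide CI
    "Checkout redesign":        [0.2, 1.5, -0.5, 1.0],
}

def mean_and_ci95(xs):
    """Mean and t-based 95% confidence interval of a small sample."""
    xs = np.asarray(xs, dtype=float)
    m, se = xs.mean(), stats.sem(xs)
    half = se * stats.t.ppf(0.975, df=len(xs) - 1)
    return m, (m - half, m + half)

# The usual (simplified) process: rank projects by the mean of the interval.
for name, xs in sorted(votes.items(), key=lambda kv: -np.mean(kv[1])):
    m, (lo, hi) = mean_and_ci95(xs)
    print(f"{name:26s} mean {m:+.2f}%  95% CI [{lo:+.2f}%, {hi:+.2f}%]")
```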

This is where the concept of EVI comes in. Could we update the mean and confidence interval by getting additional information? What would be the expected cost of that information (ECI)?
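One way to make EVI concrete is the Expected Value of Perfect Information (EVPI), which Hubbard uses as an upper bound on what any study could be worth. Below is a minimal Monte Carlo sketch with invented numbers; it assumes a normal prior on the treatment effect and a simple ship/don’t-ship decision:

```python
import numpy as np

rng = np.random.default_rng(42)

# Invented prior belief about one project's delta OEC (treatment effect):
# a +1% mean lift with enough uncertainty that the true effect may be negative.
prior_mean, prior_sd = 0.01, 0.02
value_of_full_lift = 1_000_000  # assumed: dollar value of a 100% OEC lift

deltas = rng.normal(prior_mean, prior_sd, size=100_000)

# Decision with no new information: ship only if the prior mean is positive.
value_now = max(0.0, prior_mean) * value_of_full_lift

# With perfect information we would ship exactly when the true effect is
# positive, so the expected value is E[max(0, delta)] * value.
value_perfect_info = np.mean(np.maximum(0.0, deltas)) * value_of_full_lift

evpi = value_perfect_info - value_now
print(f"Expected value of perfect information: ${evpi:,.0f}")
# A discovery project is worth running only if its expected cost (ECI) is
# well below the value of the information it provides (EVI, at most EVPI).
```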

Here are some examples of ways to reduce the uncertainty:

  1. Expert opinion. Ask some experts in the domain. If they know more than you do, there’s a good chance you can reduce the uncertainty. 
  2. Get the relevant data. Many times, a few queries can help assess the coverage of the feature (how many users would see it) and provide upper bounds on the utility. Other useful data may come from prior experiments that were run. The name “Discovery project” is sometimes used to identify this step.
  3. Conduct user studies and market research. Review sketches, mockups, and prototypes of the idea. Conduct surveys and usability studies.
  4. Run a simple/cheap controlled experiment. Some of the biggest breakthroughs resulted from trivial experiments that took days to implement. (A sketch of how even a small experiment narrows the confidence interval follows this list.)
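As promised in item 4, here is a minimal sketch (with invented numbers) of how even a small, noisy experiment narrows the confidence interval. It assumes a normal prior from the expert votes and uses a conjugate normal-normal update:

```python
import numpy as np

# Invented prior on the delta OEC from expert votes: mean +1%, sd 2%.
prior_mean, prior_sd = 0.01, 0.02

# Assumed result of a small, cheap experiment: observed lift +0.5% with a
# standard error of 1% (small sample, so still noisy).
obs_mean, obs_se = 0.005, 0.01

# Conjugate normal-normal update: the posterior is a precision-weighted
# average of the prior and the observation.
prior_prec, obs_prec = 1 / prior_sd**2, 1 / obs_se**2
post_prec = prior_prec + obs_prec
post_mean = (prior_mean * prior_prec + obs_mean * obs_prec) / post_prec
post_sd = np.sqrt(1 / post_prec)

print(f"prior:     {prior_mean:+.2%} ± {1.96 * prior_sd:.2%} (95% half-width)")
print(f"posterior: {post_mean:+.2%} ± {1.96 * post_sd:.2%}")
```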

The concept of Minimum Viable Product (MVP) championed by Eric Ries is clearly aligned with the idea of lowering the initial cost to understand the value of the feature.

Doug Hubbard’s wonderful book, How to Measure Anything, covers this idea of quantifying the value of information in Chapter 7. Here is the key diagram:

[Figure: Hubbard’s Expected Value of Information (EVI) and Expected Cost of Information (ECI) curves, from How to Measure Anything, Chapter 7]

The x-axis ranges from no information to perfect certainty. The y-axis ranges from zero value to the value of perfect information for the EVI curve, and from zero cost to high cost for the ECI curve.

Next time you are faced with the need to estimate the value of a feature, project, or choice, think about whether you can invest in a quick discovery project that has a small cost (low on the ECI curve) and high value in reducing uncertainty (high on the EVI curve).

-- Ronny Kohavi

Michael Sussman

Lead Data Scientist at Apple

Nice write-up! This is also a great way to talk about negative experiments. For example, let’s say you want to improve your site’s latency, but don’t know how many engineers to throw at the problem: how much revenue is a 100 ms improvement actually worth? A negative experiment is a good solution (intentionally give treatment users increased latency and measure the revenue impact), but senior leadership is likely not going to be happy with an experiment that “intentionally loses money.” Rephrasing the conversation in terms of “paying for information” is a much more palatable concept.
