What Do Murder and Ice-cream Have to Do with Data-driven Product Management?
The answer is: Confounders.
Research has shown that there is a spurious relationship between murder rates and ice-cream sales.(1)
A spurious relationship does not mean that one causes the other. What is happening, then? Warm weather causes people to buy more ice-cream and to behave more violently - "warm weather" is our confounder. (2)
How does this relate to data-driven product management?
Your data-driven software product depends on the ability to make good use of insights and features inferred from raw data (obviously there are other factors, such as design). The more relevant confounders you identify and integrate, the better your insights - and, in turn, the better your product features, customer happiness, and business performance. This happens when the product management team combines expertise in variable selection (independent, dependent, and confounding variables) with domain knowledge.
In the journey of becoming a data-driven PM, the first steps are hard, but it becomes easier in the long term. Here is how data-driven Product Management could start (this does not include business aspects):
- Clarify the question or market gap that the product will address or fill.
- Start with product ideation based on data rather than the visual aspects of the product. (This is very important, although jumping into PowerPoint to sketch how the product would look, together with a roadmap, is very tempting because it makes conversations with business leadership so much easier.)
- Consider sources of data, ownership, contractual aspects, privacy and GDPR.
- Approach the corresponding teams: in addition to the usual ones such as Software Development, also approach DevOps, Data Science, Data Engineering, the Legal team, etc., and inform them about the new product.
- Ask for technical details: data fields, data freshness, external and internal data integration capabilities, pipelines, availability, data collection, legal ownership, and technical ownership.
- Consult internal domain experts and ask what other confounders could influence your product insights.
- Always cross-check with an external domain expert without revealing internal, data-related product details. Ask for variables that have any relationship with the insights of your potential product. This helps because the external expert will not be biased by internal company information of any kind.
- Cross-check the new information with the Data Science and Data Engineering teams. Ask what is possible.
- Map the product/business workflow of the data, all the way to the insights.
- Ask for software, infrastructure, and data architecture design diagrams. This will enable Product Management to build a holistic picture of the various dependencies and act proactively.
- Clarify data integration and collection options - especially if a Data Lake is available.
- Have UX/UI experts do their part on the product based on the workflows and data now available. With the data available and a data-driven mindset, the UX/UI work becomes much easier.
- Plan and design for the data that will become available later, once the product is rolled out and new data is collected. This will help your new product features avoid cold-start issues.
- Further steps are standard for most PM teams.
If it is possible and you are prepared, consult with end-users and incorporate their feedback, especially once the product design starts to take shape.
I believe that the most common approach today, though luckily fading, is to start a product from the UI. However, in my experience, the success rate of such products is rather low. This is not necessarily bad in the grand scheme of things - think of learning by failing, fast iterations, embracing failure, etc. There are businesses that can tolerate iterating through many PoCs/MVPs until, eventually, a few of them mature into products. However, the willingness to afford high failure rates in software product experimentation in the name of learning seems to be fading. Ideally, the success/trial ratio should be as close as possible to 1 (e.g., out of 10 product ideas developed into PoCs, 5 were rolled out to production and are successful: the success/trial ratio is 0.5).
Suggestion: As a company, log all product initiatives at the PoC level and track how many end up being rolled out to the market (with accompanying continuous success metrics). The acceptable ratio must be defined by the business. A rule of thumb that I think makes sense is that the cost of all trials must not exceed 20% of workforce expenses. One of the main benefits of this approach is that Product Management will find it very easy to present data-driven success results and request more budget.
A product has a better chance of being successful if it is developed incrementally, heavily based on data, domain expertise, user feedback, and science. Because costs can be controlled better, companies are interested in products built on various aspects of data: owned raw data, market data, customer data, business data, to-be-collected data, third-party data, etc.
Let's get back to the core of this article, confounders. Here is an example.
Imagine that your company developed a piece of software that recommends bonuses for employees. Your software looks at pay rate, previous performance, gender, and job tenure (in any role), and uses Machine Learning (Collaborative Filtering) to recommend bonuses. Here is the twist. The Machine Learning model is biased because the company's culture is built on gender discrimination, favours higher-paid employees, and favours longer job tenure. This means the recommender will suggest the highest bonus to the male employee who holds the highest position, has been with the company the longest, and performed well when he joined - but not necessarily now. The recommender will completely miss a highly productive middle-manager businesswoman who has been with the company for a few years and has delivered great performance throughout.
Your software uses Machine Learning and is data-driven...or is it?
There are many confounders that influence these recommendations. Consider time as a confounder. If the ML model integrated a time dimension, weighted as a function of job tenure and previous performance, the recommendations would be better. Note that time carries relationships of its own. Longer job tenure does not necessarily mean better performance, but the Machine Learning model might become biased by the company's habit of rewarding longer-tenured employees (see the human capital theory quote below). The older a performance evaluation, the less relevant it is to today's bonus, and this needs to be reflected in your model parameters, perhaps as a decay parameter.
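The decay idea can be sketched in a few lines. This is not the recommender itself, just one plausible way to down-weight old evaluations: an exponential decay whose half-life (here one year) is a hypothetical tuning parameter:

```python
import math
from datetime import date

def decayed_performance(evaluations, today, half_life_days=365):
    """Average (eval_date, score) pairs with exponential decay, so older
    evaluations contribute less. half_life_days is a hypothetical tuning
    parameter: a score's weight halves every half_life_days."""
    weighted, total_weight = 0.0, 0.0
    for eval_date, score in evaluations:
        age_days = (today - eval_date).days
        w = math.exp(-math.log(2) * age_days / half_life_days)
        weighted += w * score
        total_weight += w
    return weighted / total_weight if total_weight else 0.0

# A stellar review from years ago counts far less than a good one last month,
# so the decayed score lands close to the recent evaluation.
history = [(date(2020, 1, 15), 5.0), (date(2022, 11, 1), 4.0)]
print(round(decayed_performance(history, today=date(2022, 12, 1)), 2))
```

Feeding a decayed score instead of raw historical evaluations into the model is one way to keep "performed great years ago" from dominating today's bonus.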
The following quote illustrates the different approaches that organisations might adopt.
Human capital theory suggests that as knowledge and skill increase with greater tenure, job performance will improve as well. In contrast, the literature on job design suggests that as job tenure increases, employees are likely to become more bored and less motivated at work. (3)
It is important for Product Management to consider as many data variables as possible before putting effort into roadmaps, workflows, features, and design. Even though confounders are often hard to identify and integrate, they will most likely give your product the secret sauce that is so sought after.
As food for thought, here are a few questions. What would happen if we could control the warm-weather confounder, keeping it at zero over time? Can you eliminate or control the confounders that complicate your product insights or devalue your product?