Harry Potter and the hidden variables.
The added value and competitive advantage are in the insights. You must look for the hidden variables.
The phrase from The Economist (2017), "Data is the new oil", has totally changed the economy (perhaps today we could say that it is the "new gas", but the idea is not to go through the current economic and political situation). However, the phrase is not entirely true. Oil is a scarce commodity and data is an abundant one. And being an abundant good, it brings a number of problems to analysts, companies and organizations. One of them is being able to differentiate which data generates noise and which data generates signals. Understanding which data generates signals is what I have dedicated myself to for more than 15 years.
The advanced data analytics industry is highly dynamic and has many roles. In fact, the data value chain is as complex as any other product value chain. And as we have said, because data is an abundant good there are several risks when working with it. I will leave the legal and reputational risks, among others, aside and focus on the risks involved in doing advanced analytics with complex models. The main risk is not creating value with the data. Although it seems difficult to understand, in many cases data science does not create value. And if value is created, it is not appropriated, much less distributed (the theory of value is important for working well with data and data-based products).
The secret is being able to create Insights. But what are Insights? It's easy: it's being able to find patterns that aren't obvious. This is what makes us different in the industry and gives us a temporary competitive advantage. And, therefore, we create, appropriate, and distribute value that we did not before.
An Insight is what Netflix found. Something that no one else had realized until that moment. Something on which the famous Netflix recommendation algorithm is based. Through advanced data analysis they found that our brain generates dopamine when we are facing a new series (see Young, J. et al., 2021). The excitement of the premiere.
And since dopamine gives us that need to have a reward as quickly as possible, they chose the launch day that would make a series successful. And if I want to see a whole season of 8 episodes as soon as possible, is any day the same for the release? What day would you recommend? (The clue is in the photo below. Yes, Fridays.)
But if I create a competitive advantage based on data, what is going to happen? Just look at the photo below. Yes, they will copy us.
But when we talk about behavior and data, the two things have several factors in common, and one of them is dynamism. Because we behave differently, because we are changing our behavior, and because that is evidenced in the data. Therefore, Disney+ took the lead in the insight analysis and found that our behavior had changed. That the day had changed. That we had more time in the week to watch series. And if we know that the day changed, because I found it in the data, what is the decision to be made? Exactly, change the day.
And if it goes well, the dynamics of the industry change. And if I innovate with courage, they will copy me. And as a second mover, Netflix now followed Disney+.
Therefore, when we see these graphs, we should not be amazed. Honestly, there is a lot of advanced data analysis and behavioral theory (psychology, anthropology, sociology, and other social sciences explaining it) behind these results.
But the power to develop Insights is not exclusive to streaming platforms. It can be found in all industries and sectors. As another example, I leave you UPS, where they found an Insight that saved them more than 500 million dollars. The Insight was: "don't turn left". Why not? Because in the US you can turn right on a red light. Therefore, there is no lost time, there are fuel savings, increased efficiency, and increased productivity.
But this is not easy. Finding these Insights is not easy. That is why when they are found, competitive advantage is automatically generated for the company or business that found them. What is the reason why we find so few Insights? The answer is quite simple, because of the unconscious biases that we have.
I will show you two examples. The first is that many times we use data science to confirm what we already know. That is to say, we continue along the path of experience or of the simple. Many times, data scientists are more concerned with making an algorithm as complex as possible, using the latest technology, or the latest R or Python library, and fall into confirmation bias. That is, they confirm what they knew, the company knew, the C-level knew, the industry knew. And when I ask about this type of situation in different conferences and meetings, they reply "but now I have the data that confirms it". But that seems more like an academic paper of empirical research than generating value and innovating based on data in the industry.
The second point, which I see a lot with economic and statistical profiles, is the lack of will to dig into a spurious relationship (or, sometimes, spurious correlation). I regularly hear colleagues who work with data say "it's correlation, but not causation", or "that correlation is spurious, it doesn't work, you're looking at things the wrong way. It does not work. Wrong". And the truth is that with this way of thinking, in data analysis, a lot of money is left on the table.
Why is so much economic value lost by ruling out a spurious correlation? Because, in my experience, the Insight is in the hidden variables that explain that spurious correlation. A spurious correlation is a mathematical relationship in which two events have no direct causal connection, although it can appear that they do because of a third factor not yet considered (called a "confounder" or "hidden variable"). As an example, the correlation between avoiding left turns and reducing fuel consumption (that was all the data that UPS had in its databases) does not make sense on its own. It is a spurious correlation. But what generates the Insight is the traffic light data. It is a variable out of sight, hidden, that ends up giving a lot of value to the business. In the case of Netflix, the hidden variable (which was not a priori within the company's data set) is the generation of dopamine in consumers. The excitement of the premiere.
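To make the idea of a hidden variable concrete, here is a minimal sketch in Python. It is not UPS or Netflix data: the variables `x`, `y` and `z`, the coefficients, and the helper names are all assumptions made up for illustration. The simulation builds two observed variables that never influence each other directly, both driven by a hidden variable `z`, and then compares the raw correlation with the correlation that remains once `z` is taken into account (a simple partial correlation).

```python
import numpy as np


def residuals(target, control):
    """Return what is left of `target` after removing a straight-line fit on `control`."""
    slope, intercept = np.polyfit(control, target, deg=1)
    return target - (slope * control + intercept)


def raw_and_partial_correlation(x, y, z):
    """Correlation of x and y, before and after controlling for z."""
    raw = np.corrcoef(x, y)[0, 1]
    partial = np.corrcoef(residuals(x, z), residuals(y, z))[0, 1]
    return raw, partial


# Simulated data (hypothetical): z is the hidden variable,
# x and y depend on z but not on each other.
rng = np.random.default_rng(seed=0)
n = 5_000
z = rng.normal(size=n)
x = 2.0 * z + rng.normal(size=n)
y = -1.5 * z + rng.normal(size=n)

raw, partial = raw_and_partial_correlation(x, y, z)
print(f"raw correlation between x and y:   {raw:+.2f}")
print(f"correlation after controlling z:   {partial:+.2f}")
```

In this simulation the raw correlation comes out strongly negative, while the partial correlation is close to zero. An analyst who only has `x` and `y` sees a relationship that "does not make sense", and the explanation only appears once `z` is brought into the data set.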
In short, use data science to find Insights. It is what creates value. It is what creates competitive advantage. It is what makes us market leaders. Be careful with confirmation bias: although it is good to confirm our hypotheses, many times this does not add relevant value to the industry. And finally, always look for hidden variables. Obsess over spurious correlations and hidden variables. For that, you must combine data science with several other disciplines related to behavioral science.
Insights is one of those concepts that, in my experience, so many talk about and yet few know what they truly are. For instance, I've seen that C-level executives can fall into the trap of simply not "believing" the insight. This seems to happen when what you discover is either too complicated, or the executive does not have the necessary knowledge to grasp it. Sure, there is another reason, a pretty common one too, when they say that "if what you say is true, then everybody would be doing it". I think I get it. It's not easy in general to go after every pot of gold; it takes courage in addition to knowledge and experience. That is probably what would differentiate the innovators and early adopters from the late majority. Interesting article! 😄