Analytics: Lessons Learned from Winston Churchill
Analytics has been around for quite some time now. Even during World War II, it proved critical to the Allied victory. Famous examples of Allied analytical work include the breaking of the Enigma code, which effectively neutralized the threat of German submarine warfare, and the 3D reconstruction of 2D images shot by gunless Spitfires, which helped the intelligence unit at RAF Medmenham counter the danger of the V1 and V2 and support Operation Overlord. Many of the analytical lessons learned at that time are now more relevant than ever, in particular those provided by one of the great victors of WWII, the then Prime Minister, Sir Winston Churchill.
The phrase “I only believe in statistics that I doctored myself” is often attributed to him. However, while its wit is certainly typical of the Greatest Briton, it was probably an invention of Nazi propaganda. Even so, can Churchill still teach us something about statistical analysis and Analytics?
A good analytical model should satisfy several requirements depending upon the application area, and it should follow a certain process. CRISP-DM (the Cross-Industry Standard Process for Data Mining), a leading methodology for conducting data-driven analysis, proposes a structured approach: understand the business, understand the data, prepare the data, build a model, evaluate it, and deploy the solution. The wisdom of the 1953 Nobel laureate in Literature can help us better understand this process.
Have an actionable approach: aim at solving a real business issue
Any analytics project should start with a business problem and then provide a solution. Indeed, Analytics is not a purely technical, statistical or computational exercise, since any analytical model needs to be actionable. For example, a model can allow us to predict future problems such as credit card fraud or customer churn. Managers, like politicians, are decision-makers, so they need “the ability to foretell what is going to happen tomorrow, next week, next month, and next year... And to have the ability afterwards to explain why it didn't happen.” In other words, even when the model fails to predict what really happened, its ability to explain the process in an intelligible way remains crucial.
To be relevant for the business, the parties concerned first need to define and qualify the problem before the analysis can effectively deliver a solution. For example, trying to predict what will happen in 10 years or more makes little sense from a practical, day-to-day business perspective: “It is a mistake to look too far ahead. Only one link in the chain of destiny can be handled at a time.” Understandably, many analytical models in use in industry have prediction horizons of no more than two to three years.
Understand the data you have at your disposal
There is a fairly large gap between data and comprehension. Churchill went so far as to argue that “true genius resides in the capacity for evaluation of uncertain, hazardous, and conflicting information.” Indeed, Big Data is complex and is not a quick-fix solution for most business problems. It takes time to work through, and the big picture may even seem less clear at first. It is the role of the Business Analytics expert to really understand the data and to know which sources and variables to select.
Prepare the data
Once a complete overview of the available data has been drafted, the analyst will start preparing the tables for modelling by consolidating the different sources, selecting the relevant variables and cleaning the data sets. This is usually a very time-consuming and tedious task, but it needs to be done: “If you're going through hell, keep going.”
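As a minimal sketch of what this step can look like in practice (assuming Python with pandas, and hypothetical customer and transaction files with illustrative column names), the work amounts to consolidating sources, keeping the relevant variables and cleaning the result:

```python
import pandas as pd

# Hypothetical sources: a customer master file and a transaction history.
customers = pd.read_csv("customers.csv")        # one row per customer
transactions = pd.read_csv("transactions.csv")  # one row per transaction

# Consolidate sources: aggregate transactions per customer, then join.
spend = (transactions
         .groupby("customer_id")["amount"]
         .agg(total_spend="sum", n_transactions="count")
         .reset_index())
table = customers.merge(spend, on="customer_id", how="left")

# Select the relevant variables and clean the data set.
table = table[["customer_id", "age", "tenure_months", "total_spend", "n_transactions"]]
table[["total_spend", "n_transactions"]] = table[["total_spend", "n_transactions"]].fillna(0)
table = table[table["age"].between(18, 100)]   # drop implausible ages
table = table.drop_duplicates(subset="customer_id")
```

In real projects this step also covers missing values, outliers and coding inconsistencies across many more sources, which is exactly why it tends to consume most of the project time.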
Never forget to consider as much historical information as you can. Typically, when trying to predict future events, past transactional data is highly relevant, as most of the predictive power comes from this type of information. “The longer you can look back, the farther you can look forward.”
Modelling: be parsimonious!
Analytical models should be statistically effective as well as easily interpretable.
The model should have good statistical significance and predictive power: “It is a fine thing to be honest, but it is also very important to be right.” How this can be measured depends upon the type of Analytics considered. For example, in a classification setting (e.g. churn or fraud), the model should have good discrimination power. In a clustering setting, the clusters should be as homogeneous as possible.
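As an illustration of how these qualities can be quantified (a sketch in Python with scikit-learn on synthetic data; the specific metrics are common choices, not requirements), discrimination power in a classification setting is often summarised by the area under the ROC curve, and cluster homogeneity by the silhouette coefficient:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs, make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, silhouette_score
from sklearn.model_selection import train_test_split

# Classification: discrimination power measured by the area under the ROC curve.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))

# Clustering: compactness/homogeneity summarised by the silhouette coefficient.
Xc, _ = make_blobs(n_samples=500, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Xc)
print("Silhouette:", silhouette_score(Xc, labels))
```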
However, if one focuses too much on statistical accuracy, the risk is ending up with a black-box solution, which is usually not well received in most business settings. Neural networks are a popular example: they are universal approximators and perform very well, but they offer little insight into the underlying data patterns. In contrast, linear regression models are very transparent and comprehensible, but offer only limited modelling power.
Interpretability refers to the ease with which the analytical model, and the patterns it captures, can be understood. Indeed, these models are meant to be used, so they should not be too complex for the average business user. For example, in credit risk modelling or medical diagnosis, interpretable models are absolutely necessary in order to offer good insight into the underlying data patterns. “All great things are simple, and many can be expressed in single words.”
To be comprehensible, a model needs to be parsimonious: it must strike the right balance between performance (i.e. its capacity to capture complex processes) and interpretability. This can be seen as the expert’s ability to present relevant content concisely. You do not want to find yourself in a situation where “the length of this document defends it well against the risk of its being read.”
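One common way to enforce this kind of parsimony, sketched below in Python with scikit-learn on synthetic data (the variable counts and the penalty strength C are illustrative assumptions), is to add an L1 penalty that drives the coefficients of irrelevant variables to exactly zero, leaving a small, readable set of drivers:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Many candidate variables, only a handful of which are truly informative.
X, y = make_classification(n_samples=2000, n_features=30, n_informative=5, random_state=1)

# An L1 (lasso-style) penalty pushes irrelevant coefficients to exactly zero,
# trading a little raw performance for a far more parsimonious model.
sparse_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
kept = np.flatnonzero(sparse_model.coef_)
print(f"Variables retained: {len(kept)} of {X.shape[1]}")
```

A model built on a dozen retained variables is something a business user can read and challenge; one built on thousands of opaque weights is not.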
Evaluate the model
“However beautiful the strategy, you should occasionally look at the results.” In other words, analytical models should also be economically and operationally efficient.
To attain economic efficiency, the Business Analytics expert needs to take into account the costs incurred by the analysis. These include the costs of gathering and pre-processing the data, of analyzing it, and of putting the analytical models into production. Software costs and the drain on human and computing resources should also be taken into account. It is important to do a thorough cost-benefit analysis at the start of the project.
Operational efficiency refers to the effort needed to collect the data, pre-process it, evaluate the model and feed its outputs into business applications (e.g. campaign management, capital calculation, etc.). This is especially crucial in real-time, on-line scoring environments (e.g. fraud detection). In addition, operational efficiency also covers the effort needed to monitor and backtest the model, and to re-estimate it when necessary.
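As a minimal sketch of what such monitoring can look like (plain Python with NumPy, simulated scores, and a commonly cited but merely indicative alert threshold), the population stability index compares the score distribution seen at development time with the one observed in production, flagging when re-estimation may be due:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare the development-time score distribution with the production one."""
    cuts = np.quantile(expected, np.linspace(0, 1, bins + 1))
    cuts[0], cuts[-1] = -np.inf, np.inf   # catch values outside the original range
    e_pct = np.histogram(expected, bins=cuts)[0] / len(expected)
    a_pct = np.histogram(actual, bins=cuts)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)    # avoid log(0) and division by zero
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
dev_scores = rng.normal(0.0, 1.0, 10_000)    # scores at model development (simulated)
prod_scores = rng.normal(0.3, 1.1, 10_000)   # scores observed in production (simulated)
psi = population_stability_index(dev_scores, prod_scores)
print("PSI:", psi, "- values above ~0.25 are often read as a significant shift")
```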
A continuous improvement process
Indeed, analytical models are by nature in constant evolution. “To improve is to change; to be perfect is to have changed often.” Very often, sometimes even before the first release, the Business Analytics expert will realize that some adjustments or fine-tuning are needed. This may be because the business process itself has changed and the model needs to be adapted, or because the model is simply underperforming and needs to be fine-tuned: “Success is the ability to go from one failure to another with no loss of enthusiasm.”
Deployment
Finally, after intense effort and thorough analysis, your recommendations can be put into practice. Your strategy and actions will no longer be driven by gut-feeling or vague concepts but will be fact and data-driven. As Sir Winston Churchill himself did, you will “pass with relief from the tossing sea of Cause and Theory to the firm ground of Result and Fact.”
Analytics: blood, toil, tears and sweat?
Steering an analytics project to success requires you to follow a certain process: qualify the business problem to ensure that your approach will produce actionable results, understand and prepare the data well, and build a parsimonious model with good performance that you can easily deploy and continuously improve. To do so, you need your analysis to be actionable, statistically effective yet interpretable, and economically and operationally efficient.
If this process seems complex, who can help you steer it towards success? Business Analytics experts balance an in-depth understanding of statistics, IT and programming with business insight and communication skills. These highly trained individuals are at the centre of the Business Analytics process. Unfortunately, their profile – with the expertise to act simultaneously as project manager, business analyst, expert programmer and statistical specialist – is still very rare. And in this new data-driven economy, fewer experts will need to take charge of activities that were previously managed by a full team: “Never (...) was so much owed by so many to so few.”
This article was written in collaboration with Bart Baesens.