How to Select Useful Forecasting Methods for "Big-Data" Demand Planning and Forecasting Applications

While exploring ways to improve accuracy measurement and forecasting performance for demand planning and forecasting in the supply chain, I commonly find bad data full of everyday anomalies, clear outliers, and unusual, unnoted variation that invariably impacts results with unintended consequences.

I participated in the M3 competition several decades ago with PP-Autocast, an automated batch forecasting program, applied to the 3003 time series in the competition. This DOS-based program used a very flexible family of exponential smoothing algorithms to create "no trend/trend/trend-seasonal" forecast profiles. The methods had originally proven very effective with trending telecommunications data. Later, during the 1990s, I used the methodology in the PEER Planner software, a client-server, database-driven forecast decision support system for 'SKU by location' demand forecasting and planning applications in various supply chain companies.

With time and experience, I got a bit smarter and more agile at forecasting large datasets (larger than M3) when I could use the software to detect, fix, and monitor data quality throughout the entire forecasting process: before, during, and after completing a forecasting cycle in the database. In a sequence of recent articles posted on my LinkedIn profile, I illustrate the importance of identifying, correcting, and evaluating the impact of even a single outlier in a highly seasonal, very forecastable profile (M3 competition series N1906 is a good example).


The primary interest here is to gain some insight into how forecasting Methods can be made more useful (i.e., effective and efficient) for demand planners and forecast practitioners. At the time of the M3 competition, computer power was still too limited for in-depth Exploratory Data Analysis (EDA) and too expensive and time-consuming for large-scale testing and evaluation. Now, with disrupted economies and pandemic-impacted supply chains, it has become imperative to re-examine the process more critically using EDA.

Point Forecast Measures May Not Be Adequate for Assessing Lead-Time Forecast Accuracy and Performance

The accuracy of profile forecasting can be measured by a ‘distance’ metric between a Forecast Alphabet Profile (FAP) and an Actual Alphabet Profile (AAP). To create a Forecast Alphabet Profile (FAP), we divide each forecast value by the sum of the forecasts over a fixed horizon m. Likewise, the Actual Alphabet Profile (AAP) is obtained by dividing each actual by the sum of the actuals over the same predetermined time horizon. A performance measure for forecast Profile Accuracy is the Kullback-Leibler divergence measure D(a|f), given by:

D(a|f) = Σ ai log (ai / fi), summed over i = 1, …, m


where {ai, i = 1, …, m} and {fi, i = 1, …, m} are the actual and forecast alphabet profiles, respectively. The sum can be interpreted as a measure of ignorance or uncertainty about Profile Accuracy. When D(a|f) = 0, the alphabet profiles coincide, which we might consider 100% accuracy.
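As a minimal sketch of these calculations in Python (the function names, the small epsilon guard against zero values, and the illustrative numbers are my own, not taken from the M3 data):

```python
import numpy as np

def alphabet_profile(values, eps=1e-12):
    """Normalize a lead-time vector of non-negative values so it sums to 1."""
    v = np.maximum(np.asarray(values, dtype=float), eps)  # guard against zeros
    return v / v.sum()

def profile_accuracy(actuals, forecasts):
    """Kullback-Leibler divergence D(a|f) between the Actual Alphabet Profile
    (AAP) and the Forecast Alphabet Profile (FAP) over the same horizon m."""
    a = alphabet_profile(actuals)
    f = alphabet_profile(forecasts)
    return float(np.sum(a * np.log(a / f)))

# Illustrative 6-period example; D(a|f) = 0 would mean identical profiles.
actuals = [120, 135, 150, 160, 155, 140]
forecasts = [118, 130, 148, 162, 150, 143]
print(round(profile_accuracy(actuals, forecasts), 5))
```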

Using the information-theoretic entropy formulae, we can decompose the Profile Accuracy D(a|f) into two components: (1) a Profile Miss measure and (2) a Profile Relative Skill measure. The calculations involve only arithmetic and the logarithm, so they are easy to work out on spreadsheets. The basic relationship for profile performance measurement is Profile Miss + Profile Relative Skill = Profile Accuracy.

In the context of aiming darts (forecasts) at a dartboard (the S&OP meeting), a forecaster using judgment, methods, and models should aim for the bullseye. This requires both accuracy and skill: how consistently the forecaster can get the darts to strike the board nearest the bullseye.

How to Measure Accuracy for a Profile Analysis on a Large Scale

I have selected three methods from the 24 methods used in the M3 competition to demonstrate the similarities and differences in their performance on the same monthly data with 18-period holdout samples (1428 of the 3003 time series in the M3 dataset).

In supply chain organizations, a variety of accuracy measures are used to evaluate forecasting performance, often with the goal of finding the best Method (interpreted as the most accurate). Striving for the best Method may not be a best practice, and it is not uncommon for a lead-time forecast to be misinterpreted as a sequence of repeated one-step-ahead point forecasts, which are constructed with a rolling origin. In forecasting with a rolling origin, overrides and management adjustments are commonly made over the horizon, and it is then best practice to use the forecast errors (Actual minus Forecast) in an accuracy measure such as MAE, MAPE, sMAPE, or MASE to assess performance. Normally, the arithmetic mean (the M in these accuracy measures) is used to specify a typical accuracy, which may not be a valid use of the arithmetic mean. For lead-time forecasting, it is better to find a single, objective measure of accuracy for forecast profiles.
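For contrast, here is a brief sketch of two of those point-error measures applied to a single lead-time holdout; each averages per-period errors with the arithmetic mean rather than comparing whole profiles (the function names are mine, and the sMAPE form follows the commonly cited M3 definition):

```python
import numpy as np

def mape(actuals, forecasts):
    """Mean absolute percentage error over the holdout, in percent."""
    a, f = np.asarray(actuals, float), np.asarray(forecasts, float)
    return float(np.mean(np.abs(a - f) / np.abs(a)) * 100.0)

def smape(actuals, forecasts):
    """Symmetric MAPE over the holdout, in percent."""
    a, f = np.asarray(actuals, float), np.asarray(forecasts, float)
    return float(np.mean(200.0 * np.abs(a - f) / (np.abs(a) + np.abs(f))))
```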

Data Quality Matters Because Bad Data Will Beat a Best Forecast Every Time

Step 1. Identify Effective Forecast Profiles

Using a skill measure designed for lead-time forecasts, I obtain a value for every series forecast by a Method with a new Levenbach L-Skill score, defined by

1 − [D(a|Method) / D(a|Benchmark)]

The L-Skill score ranges from −∞ to 1. The benchmark Naive_2 (used in the M3 competition) has an L-Skill score of 0, so positive L-Skill scores are associated with effective profile forecasting for the particular Method. In the M3 competition, the N1906 time series shown above has an L-Skill score of 0.99, the highest encountered among 19 of the 24 Methods in M3. Thus, the Methods with the highest percentage of effective profile forecasts should be of interest.
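A minimal sketch of the score, reusing profile_accuracy from the earlier sketch (here the benchmark forecasts, e.g. from a Naive_2-style seasonally adjusted naive, are assumed to be supplied by the caller):

```python
def l_skill(actuals, method_forecasts, benchmark_forecasts):
    """L-Skill = 1 - D(a|Method) / D(a|Benchmark).

    The score is 0 for the benchmark itself, positive when the Method's
    profile is closer to the actual profile than the benchmark's, and
    unbounded below.
    """
    d_method = profile_accuracy(actuals, method_forecasts)
    d_benchmark = profile_accuracy(actuals, benchmark_forecasts)
    return 1.0 - d_method / d_benchmark  # assumes D(a|Benchmark) > 0
```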

Step 2. Identify Useful Forecast Profiles

An oft-quoted aphorism attributed to George Box states that All Models Are Wrong, Some Are Useful. That can be taken to mean that it is futile to attempt to find a best Method. Rather, what makes a Method useful? We should search for Methods that are effective (doing the right things to fulfill the requirements of a particular application) as well as efficient (doing things right in converting inputs to outputs) in a particular context, such as retail forecasting, inventory planning, energy forecasting, or budget forecasting.

The details of an efficient Method will be explained in a future article, but it results in an Efficiency Frontier, shown in the diagram at the top of this article.

Here is a summary of this exploration for three of the 24 methods used on 1428 monthly time series, each with a single 18-month holdout forecast profile. Method A is effective for about two-thirds of the time series and highly efficient, as are Methods B and C. With experience, when repeating forecast evaluations over multiple lead times, we can better discern whether differences in the results are important in a particular context.


Practitioner Takeaways


  • If you want to select useful forecasting methods for your planning applications, consider first validating these results by repeating multiple profile forecasts over the same time horizon with your own data (see the sketch after this list).
  • It may even be smarter (more important) to understand the forecast profiles generated by a model than the details of the fitted model. In my experience, greater agility in forecast productivity can be achieved by not dwelling too much on model-fitting details (tweaking parameters with various fitting criteria, for instance), but rather by focusing on the profiles produced by the Method.
  • The spreadsheet environment is more than adequate for this kind of analysis, as all calculations require only arithmetic operations and the logarithm. If you contact me, I will be more than happy to share the spreadsheet used for the calculations in this article.
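As a minimal sketch of that kind of validation run over many series (reusing l_skill from the earlier sketches; the tuple layout and function name are my own):

```python
def percent_effective(series_list):
    """Share of series (in percent) for which a Method's L-Skill score is positive.

    series_list: an iterable of (actuals, method_forecasts, benchmark_forecasts)
    tuples, one per time series, all over the same lead time.
    """
    scores = [l_skill(a, f, b) for a, f, b in series_list]
    return 100.0 * sum(s > 0 for s in scores) / len(scores)
```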


Hans Levenbach, PhD is Owner/CEO of Delphus, Inc and Executive Director, CPDF Professional Development Training and Certification Programs.


Dr. Hans is the author of a forecasting book (Change&Chance Embraced) recently updated with the LZI method for intermittent demand forecasting in the Supply Chain.


With endorsement from the International Institute of Forecasters, he created the CPDF certification curriculum for the professional development of demand forecasters and has conducted numerous hands-on Professional Development Workshops for Demand Planners and Operations Managers in multinational supply chain companies worldwide.


The 2021 CPDF Workshop manual is available for self-study, online workshops, or in-house professional development courses.

Hans is a Fellow, Past President and former Treasurer, and member of the Board of Directors of the International Institute of Forecasters.

He is Owner/Manager of these LinkedIn groups: (1) Demand Forecaster Training and Certification, Blended Learning, Predictive Visualization, and (2) New Product Forecasting and Innovation Planning, Cognitive Modeling, Predictive Visualization.

I invite you to join these groups and share your thoughts and practical experiences with demand data quality and demand forecasting performance in the supply chain. Feel free to send me the details of your findings, including the underlying data without identifying proprietary descriptions. If possible, I will attempt an independent analysis and see if we can collaborate on something that will be beneficial to everyone.
