MODEL FORECASTING
Forecasting is the process of making predictions about the future based on past and present data, most commonly by analysis of trends. A commonplace example might be estimation of some variable of interest at some specified future date. Prediction is a similar, but more general term. Both might refer to formal statistical methods employing time series, cross-sectional or longitudinal data, or alternatively to less formal judgmental methods. Usage can differ between areas of application: for example, in hydrology the terms "forecast" and "forecasting" are sometimes reserved for estimates of values at certain specific future times, while the term "prediction" is used for more general estimates, such as the number of times floods will occur over a long period.
Risk and uncertainty are central to forecasting and prediction; it is generally considered good practice to indicate the degree of uncertainty attaching to forecasts. In any case, the data must be up to date in order for the forecast to be as accurate as possible. In some cases the data used to predict the variable of interest is itself forecast.
Qualitative vs. quantitative methods
Qualitative forecasting techniques are subjective, based on the opinion and judgment of consumers and experts; they are appropriate when past data are not available. They are usually applied to intermediate- or long-range decisions. Examples of qualitative forecasting methods are informed opinion and judgment, the Delphi method, market research, and historical life-cycle analogy.
Quantitative forecasting models are used to forecast future data as a function of past data. They are appropriate when past numerical data are available and when it is reasonable to assume that some of the patterns in the data will continue into the future. These methods are usually applied to short- or intermediate-range decisions. Examples of quantitative forecasting methods are last period demand, simple and weighted N-period moving averages, simple exponential smoothing, Poisson process model-based forecasting, and multiplicative seasonal indexes. Previous research shows that different methods may lead to different levels of forecasting accuracy. For example, a GMDH neural network was found to have better forecasting performance than classical forecasting algorithms such as single exponential smoothing, double exponential smoothing, ARIMA, and back-propagation neural networks.
Average approach
In this approach, the predictions of all future values are equal to the mean of the past data. This approach can be used with any sort of data where past data is available. In time series notation:
{\displaystyle {\hat {y}}_{T+h|T}={\bar {y}}=(y_{1}+...+y_{T})/T}
where {\displaystyle y_{1},...,y_{T}} is the past data.
Although the time series notation has been used here, the average approach can also be used for cross-sectional data (when we are predicting unobserved values; values that are not included in the data set). Then, the prediction for unobserved values is the average of the observed values.
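The average approach can be sketched in a few lines of Python. This is only an illustration: the function name `average_forecast`, its parameters, and the sample values are assumptions made for this example and do not come from any particular library.

```python
from statistics import fmean


def average_forecast(history, horizon):
    """Forecast every future step as the mean of the observed history."""
    if not history:
        raise ValueError("history must contain at least one observation")
    mean = fmean(history)
    # The same value is repeated for every step h = 1, ..., horizon.
    return [mean] * horizon


# Hypothetical past observations y_1, ..., y_T.
past = [12.0, 15.0, 11.0, 14.0]
print(average_forecast(past, 3))
```

Because the forecast does not depend on the ordering of the observations, the same function applies to cross-sectional data, where it returns the mean of the observed values as the prediction for each unobserved value.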
Naïve approach
Naïve forecasts are the most cost-effective forecasting approach and provide a benchmark against which more sophisticated models can be compared. This forecasting method is only suitable for time series data. Using the naïve approach, forecasts are produced that are equal to the last observed value. This method works quite well for economic and financial time series, which often have patterns that are difficult to reliably and accurately predict. If the time series is believed to have seasonality, the seasonal naïve approach, in which the forecasts are equal to the value from the last season, may be more appropriate. In time series notation, the naïve forecast is:
{\displaystyle {\hat {y}}_{T+h|T}=y_{T}}
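A minimal Python sketch of the naïve forecast follows. The function name `naive_forecast` and the sample series are hypothetical and chosen only for illustration.

```python
def naive_forecast(history, horizon):
    """Forecast every future step as the last observed value y_T."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


# Hypothetical time series; the last observation is carried forward.
series = [101.5, 102.0, 100.8, 103.2]
print(naive_forecast(series, 4))
```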
Drift method
A variation on the naïve method is to allow the forecasts to increase or decrease over time, where the amount of change over time (called the drift) is set to be the average change seen in the historical data. So the forecast for time {\displaystyle T+h} is given by
{\displaystyle {\hat {y}}_{T+h|T}=y_{T}+{\frac {h}{T-1}}\sum _{t=2}^{T}(y_{t}-y_{t-1})=y_{T}+h\left({\frac {y_{T}-y_{1}}{T-1}}\right).}
This is equivalent to drawing a line between the first and last observation, and extrapolating it into the future.
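The drift formula can be sketched in Python as follows. The function name `drift_forecast` and the sample data are assumptions for this example. The code uses the closed form on the right-hand side of the equation, which equals the average of the period-to-period changes.

```python
def drift_forecast(history, horizon):
    """Extrapolate the line through the first and last observations."""
    n = len(history)
    if n < 2:
        raise ValueError("at least two observations are needed to estimate drift")
    # Average historical change: (y_T - y_1) / (T - 1).
    drift = (history[-1] - history[0]) / (n - 1)
    return [history[-1] + h * drift for h in range(1, horizon + 1)]


# Hypothetical series: the changes are +2, -1, +5, so the average drift is 2.
series = [10.0, 12.0, 11.0, 16.0]
print(drift_forecast(series, 3))
```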
Seasonal naïve approach
The seasonal naïve method accounts for seasonality by setting each prediction to be equal to the last observed value of the same season. For example, the prediction value for all subsequent months of April will be equal to the previous value observed for April. The forecast for time {\displaystyle T+h} is
{\displaystyle {\hat {y}}_{T+h|T}=y_{T+h-km}}
where {\displaystyle m} is the seasonal period and {\displaystyle k} is the smallest integer greater than {\displaystyle (h-1)/m}.
The seasonal naïve method is particularly useful for data that has a very high level of seasonality.
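The indexing in the seasonal naïve formula can be easier to follow in code. The sketch below is illustrative only: the function name `seasonal_naive_forecast` and the sample data are assumptions, and the series is stored in a zero-indexed Python list, so the observation written as y_t in the formula is `history[t - 1]`.

```python
def seasonal_naive_forecast(history, horizon, period):
    """Forecast each step with the last observed value from the same season."""
    n = len(history)
    if period < 1 or n < period:
        raise ValueError("need at least one full seasonal period of history")
    forecasts = []
    for h in range(1, horizon + 1):
        # k is the smallest integer greater than (h - 1) / m.
        k = (h - 1) // period + 1
        # y_{T + h - k m}, shifted by one for zero-based indexing.
        forecasts.append(history[n + h - k * period - 1])
    return forecasts


# Hypothetical series with a seasonal period of 4 (for example, quarterly data).
series = [1, 2, 3, 4, 5, 6, 7, 8]
print(seasonal_naive_forecast(series, 6, 4))
```

For horizons longer than one seasonal period the forecasts simply repeat the last observed season.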
Time series methods
Time series methods use historical data as the basis of estimating future outcomes. They are based on the assumption that past demand history is a good indicator of future demand.
- Moving average
- Weighted moving average
- Exponential smoothing
- Autoregressive moving average (ARMA) (forecasts depend on past values of the variable being forecast and on past prediction errors)
- Autoregressive integrated moving average (ARIMA) (ARMA on the period-to-period change in the forecast variable), e.g. Box–Jenkins
- Seasonal ARIMA (SARIMA) or ARIMARCH
- Extrapolation
- Linear prediction
- Trend estimation (predicting the variable as a linear or polynomial function of time)
- Growth curve (statistics)
- Recurrent neural network
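As an illustration of the first three methods in the list above, the following Python sketch computes one-step-ahead forecasts. The function names, the choice of weights, the smoothing parameter `alpha`, and the decision to initialise the smoothed level with the first observation are all assumptions made for this example; other conventions are in common use.

```python
def moving_average_forecast(history, window):
    """One-step forecast: mean of the last `window` observations."""
    if window < 1 or len(history) < window:
        raise ValueError("window must be between 1 and len(history)")
    return sum(history[-window:]) / window


def weighted_moving_average_forecast(history, weights):
    """One-step forecast; weights[-1] applies to the most recent observation."""
    if not weights or len(history) < len(weights):
        raise ValueError("need at least as many observations as weights")
    recent = history[-len(weights):]
    return sum(w * y for w, y in zip(weights, recent)) / sum(weights)


def exponential_smoothing_forecast(history, alpha):
    """One-step forecast from simple exponential smoothing."""
    if not history or not 0 < alpha <= 1:
        raise ValueError("need data and a smoothing parameter in (0, 1]")
    level = history[0]  # assumed initialisation: start at the first observation
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


# Hypothetical demand history.
demand = [10.0, 20.0, 30.0, 40.0]
print(moving_average_forecast(demand, 2))
print(weighted_moving_average_forecast(demand, [1, 3]))
print(exponential_smoothing_forecast(demand, 0.5))
```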
Relational methods
Some forecasting methods try to identify the underlying factors that might influence the variable that is being forecast. For example, including information about climate patterns might improve the ability of a model to predict umbrella sales. Forecasting models often take account of regular seasonal variations. In addition to climate, such variations can also be due to holidays and customs: for example, one might predict that sales of college football apparel will be higher during the football season than during the off season.
Several informal methods used in causal forecasting do not rely solely on the output of mathematical algorithms, but instead use the judgment of the forecaster. Some forecasts take account of past relationships between variables: if one variable has, for example, been approximately linearly related to another for a long period of time, it may be appropriate to extrapolate such a relationship into the future, without necessarily understanding the reasons for the relationship.
Causal methods include:
- Regression analysis includes a large group of methods for predicting future values of a variable using information about other variables. These methods include both parametric (linear or non-linear) and non-parametric techniques.
- Autoregressive moving average with exogenous inputs (ARMAX)
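A minimal sketch of the simplest regression case, a straight line fitted by ordinary least squares to a single explanatory variable, is shown below. The function names `fit_line` and `predict` and the rainfall and sales figures are hypothetical and serve only to illustrate the idea.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predict the forecast variable from a new value of the explanatory variable."""
    return intercept + slope * x_new


# Hypothetical data: rainy days per month and umbrella sales (in hundreds).
rainy_days = [1, 2, 3, 4]
umbrella_sales = [3, 5, 7, 9]
intercept, slope = fit_line(rainy_days, umbrella_sales)
print(predict(intercept, slope, 5))
```

Such a forecast requires a value of the explanatory variable for the future period, which may itself have to be forecast.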
Quantitative forecasting models are often judged against each other by comparing their in-sample or out-of-sample mean square error, although some researchers have advised against this. Different forecasting approaches have different levels of accuracy. For example, it was found in one context that GMDH has higher forecasting accuracy than traditional ARIMA.
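An out-of-sample comparison by mean square error might look like the following sketch. The split into `train` and `test`, the data values, and the choice of the naïve and average methods as the two candidates are assumptions made for this example.

```python
def mean_squared_error(actual, forecast):
    """Mean of the squared differences between actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical series, split into a fitting period and a hold-out period.
train = [10.0, 12.0, 11.0, 13.0]
test = [14.0, 15.0]

# Candidate 1: naive forecast (last training value repeated).
naive = [train[-1]] * len(test)
# Candidate 2: average forecast (training mean repeated).
average = [sum(train) / len(train)] * len(test)

mse_naive = mean_squared_error(test, naive)
mse_average = mean_squared_error(test, average)
print(mse_naive, mse_average)
```

The forecasts are produced from the training period only, so the error on the hold-out period measures out-of-sample rather than in-sample accuracy.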
Judgmental methods
Judgmental forecasting methods incorporate intuitive judgement, opinions and subjective probability estimates. Judgmental forecasting is used in cases where there is a lack of historical data or during completely new and unique market conditions.
Judgmental methods include:
- Composite forecasts
- Cooke's method
- Delphi method
- Forecast by analogy
- Scenario building
- Statistical surveys
- Technology forecasting
Artificial intelligence methods
Often these are done today by specialized programs loosely labeled
Other methods
Forecasting accuracy
The forecast error (also known as a residual) is the difference between the actual value and the forecast value for the corresponding period:
{\displaystyle E_{t}=Y_{t}-F_{t}}
where E is the forecast error at period t, Y is the actual value at period t, and F is the forecast for period t.
A good forecasting method will yield residuals that are uncorrelated. If there are correlations between residual values, then there is information left in the residuals which should be used in computing forecasts. This can be accomplished by computing the expected value of a residual as a function of the known past residuals, and adjusting the forecast by the amount by which this expected value differs from zero.
A good forecasting method will also yield residuals with zero mean. If the residuals have a mean other than zero, then the forecasts are biased and can be improved by adjusting the forecasting technique by an additive constant that equals the mean of the unadjusted residuals.
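These residual checks can be sketched in Python. The function names and sample values below are hypothetical. The lag-1 sample autocorrelation is used here as one simple way to look for correlation between successive residuals; it is an assumption of this example rather than a procedure prescribed above.

```python
def forecast_errors(actual, forecast):
    """Residuals E_t = Y_t - F_t for each period."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have equal length")
    return [y - f for y, f in zip(actual, forecast)]


def mean_error(errors):
    """Mean residual; a value far from zero indicates biased forecasts."""
    return sum(errors) / len(errors)


def lag1_autocorrelation(errors):
    """Sample autocorrelation of the residuals at lag 1."""
    n = len(errors)
    m = sum(errors) / n
    denom = sum((e - m) ** 2 for e in errors)
    if denom == 0:
        raise ValueError("residuals are constant; autocorrelation is undefined")
    num = sum((errors[t] - m) * (errors[t - 1] - m) for t in range(1, n))
    return num / denom


def bias_adjusted(forecast, errors):
    """Add the mean of the unadjusted residuals to each forecast."""
    bias = mean_error(errors)
    return [f + bias for f in forecast]


# Hypothetical actual values and forecasts that are consistently one unit low.
actual = [10.0, 12.0, 14.0, 16.0]
forecast = [9.0, 11.0, 13.0, 15.0]
errors = forecast_errors(actual, forecast)
print(mean_error(errors))
print(bias_adjusted(forecast, errors))
```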