Explaining the results of the M3 forecasting competition

Citation
M.P. Clements and D.F. Hendry, Explaining the results of the M3 forecasting competition, INT J FOREC, 17(4), 2001, pp. 550-554
Number of citations
16
Subject categories
Management
Journal title
INTERNATIONAL JOURNAL OF FORECASTING
ISSN journal
0169-2070
Volume
17
Issue
4
Year of publication
2001
Pages
550 - 554
Database
ISI
SICI code
0169-2070(200110/12)17:4<550:ETROTM>2.0.ZU;2-I
Abstract
Makridakis and Hibon (2000) summarize four main implications of the latest forecasting competition, which we paraphrase as: (a) 'simple methods do best'; (b) 'the accuracy measure matters'; (c) 'pooling helps'; and (d) 'the evaluation horizon matters'. We applaud the detailed empirical investigations and are unsurprised by their summary, but are surprised by the assertion that 'the strong empirical evidence, however, has been ignored by theoretical statisticians'. Having successfully published two books and more than a dozen papers across a wide range of journals, which inter alia analyze their four points, we refute the claim that the issue is being 'ignored', and doubt the implicit suggestion of hostility by the profession.

What must be the relationship between the world to be forecast and the models with which we forecast for conditions (a)-(d) not to hold? The research summarized in Clements and Hendry (1998b, 1999) (henceforth CH98 and CH99) shows that in weakly stationary processes, a congruent, encompassing model in-sample will dominate in forecasting at all horizons.(2) When the data generating process (DGP) is complicated, as is likely in economics, then so will be the dominant model, subject to possible losses from parameter estimation (CH98, ch. 12). Causal variables will dominate non-causal (CH99, ch. 1), forecast accuracy will deteriorate as the horizon increases, and there will be no forecast-accuracy gains from pooling forecasts across methods or models: indeed, pooling refutes encompassing. These are perhaps the 'optimality' claims that Makridakis and Hibon (2000) correctly doubt are empirically relevant.

The results of the forecasting competitions are manifestly at odds with such strong 'theoretical predictions'.
This discrepancy between theory and practice (noted by, e.g., Fildes & Makridakis, 1995), and the systematic mis-forecasting and forecast failure that has periodically blighted macroeconomics, stimulated the research summarized in CH98 and CH99. The 'textbook' paradigm discussed in the previous paragraph offers no explanation for observed forecast failures, although they have sometimes been attributed to 'mis-specified models', 'poor methods', 'inaccurate data', 'incorrect estimation', 'data-based model selection' and so on, without those claims being proved: our research demonstrates the lack of foundation for such 'explanations'. The reason that (a)-(d) hold in practice is that economies are non-stationary and evolving processes which are not reducible to stationarity by differencing, thereby generating moments that are non-constant over time.

Modern economies are regularly subject to major institutional, political, financial, legal, fashion, and technological changes which manifest themselves as structural breaks in models relative to the underlying DGP. Models are far from being facsimiles of the DGP, and even if they closely resembled it in-sample, unanticipated structural change could seriously reduce their usefulness for forecasting. Our research suggests that models which are relatively robust to, or adapt rapidly to, structural change are most likely to be successful in forecasting. Specifically, shifts in deterministic terms appear to be especially injurious to forecasting, and to be a primary factor underlying systematic forecast failure, as they cause a shift in the model's forecast mean relative to the data mean. Other breaks are surprisingly difficult to detect and have relatively benign effects on forecasts (see Hendry & Doornik, 1997; Hendry, 2000).
The remaining potential sources of forecast failure, ranging from model mis-specification, a lack of parsimony (including failure to impose restrictions such as unit roots and cointegration), and inaccurate forecast-origin data, through to inefficient estimation, may all exacerbate forecast failure, but generally just play supporting roles.