MLconf, Atlanta

September 15, 2017

Models built from cross-sectional data can use cross-validation for model selection, but most time series models cannot be cross-validated in the usual way because of the temporal structure of the data used to create them. A rolling cross-validation technique is possible; however, it is computationally expensive and gives no indication of the models’ long-term forecast accuracy.

This talk explains how decision theory can be used to automate time series model selection and streamline the manual process of validation and testing. The approach creates consecutive, temporally independent holdout sets, computes performance metrics for each model’s predictions on each holdout set, and feeds those metrics into a decision function that selects an unbiased model. The decision function seeks to minimize the poorest performance of a model across all holdout sets, which counteracts the possibility of choosing a model that overfits or underfits the holdout sets. This process not only improves forecast accuracy but also reduces computation time, because it requires only a fixed number of proposed forecasting models.
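
One way to read the selection step is as a minimax rule: for each candidate model, take its worst error over the holdout sets, then pick the model whose worst error is smallest. The Python sketch below illustrates that reading only and is not the talk's actual implementation. The function names `consecutive_holdouts` and `minimax_select`, the model labels, and the error values are all hypothetical, chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def consecutive_holdouts(
    series: Sequence[float], n_sets: int, size: int
) -> Tuple[Sequence[float], List[Sequence[float]]]:
    """Split the tail of a series into n_sets consecutive holdout blocks.

    Hypothetical helper: the earliest observations form the training
    data, and the last n_sets * size observations are cut into
    non-overlapping blocks in time order.
    """
    start = len(series) - n_sets * size
    if start <= 0:
        raise ValueError("series is too short for the requested holdout sets")
    train = series[:start]
    holdouts = [
        series[start + i * size : start + (i + 1) * size] for i in range(n_sets)
    ]
    return train, holdouts


def minimax_select(errors: Dict[str, List[float]]) -> str:
    """Return the model whose worst holdout error is smallest.

    `errors` maps a model name to its error on each holdout set
    (lower is better). This is an assumed minimax-style decision rule.
    """
    if not errors:
        raise ValueError("no candidate models supplied")
    worst = {name: max(errs) for name, errs in errors.items()}
    return min(worst, key=worst.get)


# Made-up error values for three hypothetical candidate models,
# one error per holdout set.
holdout_errors = {
    "model_a": [0.10, 0.35, 0.09],  # best on average, but one poor holdout
    "model_b": [0.18, 0.20, 0.19],  # consistent across holdouts
    "model_c": [0.25, 0.27, 0.40],
}

train, holdouts = consecutive_holdouts(list(range(10)), n_sets=3, size=2)
selected = minimax_select(holdout_errors)
print(selected)
```

With these made-up numbers, ranking by mean error would favor `model_a`, while the minimax rule prefers `model_b` because its worst holdout error is lower. That is the kind of protection against a model that fits some holdout sets well and others poorly that the abstract describes.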