Temporal cross-validation in forecasting: A case study of COVID-19 incidence using wastewater data
dc.contributor.author | Lai, M. | |
dc.contributor.author | Wulff, S. S. | |
dc.contributor.author | Cao, Y. | |
dc.contributor.author | Robinson, T. J. | |
dc.contributor.author | Rajapaksha, R. | |
dc.date.accessioned | 2024-12-16T06:25:44Z | |
dc.date.available | 2024-12-16T06:25:44Z | |
dc.date.issued | 2024-11 | |
dc.description.abstract | Two predominant methodologies in forecasting temporal processes include traditional time series models and machine learning methods. This paper investigates the impact of time series cross-validation (TSCV) on both approaches in the context of a case study predicting the incidence of COVID-19 based on wastewater data. The TSCV framework outlined in the paper begins by engineering interpretable features hypothesized as potential predictors of COVID-19 incidence. Feature selection and hyperparameter tuning are then utilized with TSCV to identify the best features and hyperparameters for optimal model performance given a specific forecast horizon. While evidence supporting the utility of TSCV for auto-regressive integrated moving average model with exogenous variables (TS-ARIMAX) forecasts is lacking in this study, such an approach proves advantageous for gradient boosting machine forecasts (TS-GBM). In Wyoming, for instance, TS-GBM had a 34.9% improvement compared to naïve predictions, whereas GBM without TSCV only had a 15.6% improvement. However, TSCV also enhances interpretability for both TS-ARIMAX and TS-GBM models as this approach selects specific features, such as lagged values of COVID-19 cases, based on forecast performance and forecast length. Future research should work to explore the influence of stationarity and model averaging on the performance of TSCV in forecasting applications. | en_US |
dc.identifier.citation | Lai, M., Wulff, S. S., Cao, Y., Robinson, T. J., & Rajapaksha, R. (2024). Temporal cross‐validation in forecasting: A case study of COVID‐19 incidence using wastewater data. Quality and Reliability Engineering International. https://doi.org/10.1002/qre.3686 | en_US |
dc.identifier.uri | http://repository.kln.ac.lk/handle/123456789/28968 | |
dc.publisher | Quality and Reliability Engineering International | en_US |
dc.title | Temporal cross-validation in forecasting: A case study of COVID-19 incidence using wastewater data | en_US |
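
The abstract describes selecting features and hyperparameters by time series cross-validation (TSCV) for a given forecast horizon and reporting improvement relative to naïve predictions. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a rolling-origin scheme with an expanding training window, mean absolute error as the forecast metric, and percentage improvement defined as the relative reduction in error against a last-value naïve forecast. The record specifies none of these choices. All function names are hypothetical, and a simple moving-average forecaster stands in for the ARIMAX and GBM models.

```python
from statistics import mean


def rolling_origin_splits(n_obs, initial_train, horizon, step=1):
    """Yield (train_indices, test_index) pairs for rolling-origin TSCV.

    Training indices always precede the test index, so no future
    observation leaks into model fitting (assumed expanding window).
    """
    origin = initial_train
    while origin + horizon <= n_obs:
        yield range(0, origin), origin + horizon - 1
        origin += step


def naive_forecast(history, horizon):
    """Last-value naive forecast: repeat the most recent observation."""
    return history[-1]


def moving_average_forecaster(k):
    """Hypothetical stand-in model: forecast the mean of the last k values."""
    def forecast(history, horizon):
        return mean(history[-k:])
    return forecast


def tscv_mae(series, forecaster, initial_train, horizon):
    """Mean absolute error of a forecaster across all TSCV folds."""
    errors = []
    for train_idx, test_idx in rolling_origin_splits(
        len(series), initial_train, horizon
    ):
        history = [series[i] for i in train_idx]
        errors.append(abs(series[test_idx] - forecaster(history, horizon)))
    return mean(errors)


def select_window(series, candidates, initial_train, horizon):
    """Pick the hyperparameter k with the lowest TSCV error for this horizon."""
    return min(
        candidates,
        key=lambda k: tscv_mae(
            series, moving_average_forecaster(k), initial_train, horizon
        ),
    )


def percent_improvement(model_error, naive_error):
    """Relative error reduction versus the naive forecast, in percent
    (an assumed definition of 'improvement')."""
    return 100.0 * (naive_error - model_error) / naive_error
```

Because the selection in `select_window` is repeated for each value of `horizon`, different horizons can end up with different chosen settings, which is the sense in which the abstract says features are selected based on forecast performance and forecast length.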