Time series forecasting using machine learning algorithms has gained popularity recently. Random forest (RF) is a machine learning algorithm that has been applied to time series forecasting; however, most of its forecasting properties remain unexplored. Here we assess the performance of RF in one-step forecasting using two large datasets of short time series, with the aim of suggesting an optimal set of predictor variables. Furthermore, we compare the performance of RF to that of benchmark methods. The first dataset is composed of 16,000 simulated time series from a variety of Autoregressive Fractionally Integrated Moving Average (ARFIMA) models. The second dataset consists of 135 mean annual temperature time series. The highest predictive performance of RF is observed when using a small number of recent lagged predictor variables. This outcome could be useful in relevant future applications, with the prospect of achieving higher predictive accuracy.
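To illustrate what "recent lagged predictor variables" means in one-step forecasting, the following is a minimal sketch and not the implementation used in this work. The function names `make_lagged_dataset` and `last_lags`, the toy series, and the choice of two lags are all hypothetical and serve only as an illustration.

```python
from typing import List, Sequence, Tuple


def make_lagged_dataset(
    series: Sequence[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (X, y) for one-step forecasting from the n_lags most recent values.

    Row i of X holds series[i : i + n_lags] (oldest lag first), and y[i] is
    the value that immediately follows it, series[i + n_lags].
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    n_rows = len(series) - n_lags
    X = [list(series[i:i + n_lags]) for i in range(n_rows)]
    y = [series[i + n_lags] for i in range(n_rows)]
    return X, y


def last_lags(series: Sequence[float], n_lags: int) -> List[float]:
    """Predictor row for forecasting the first value after the end of the series."""
    return list(series[-n_lags:])


# Toy series (hypothetical data, for illustration only).
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = make_lagged_dataset(series, n_lags=2)
x_next = last_lags(series, n_lags=2)
```

Under these assumptions, `X` and `y` could be passed to any regression implementation of random forests (for example, fitting on `X`, `y` and then predicting from `[x_next]`). Varying `n_lags` is one way to explore how the number of lagged predictors affects one-step forecasting accuracy.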
Our works that reference this work:
1. G. Papacharalampous, H. Tyralis, and D. Koutsoyiannis, One-step ahead forecasting of geophysical processes within a purely statistical framework, Geoscience Letters, 5, 12, doi:10.1186/s40562-018-0111-1, 2018.
2. G. Papacharalampous, H. Tyralis, and D. Koutsoyiannis, Univariate time series forecasting of temperature and precipitation with a focus on machine learning algorithms: a multiple-case study from Greece, Water Resources Management, 32 (15), 5207–5239, doi:10.1007/s11269-018-2155-6, 2018.
3. G. Papacharalampous, H. Tyralis, and D. Koutsoyiannis, Comparison of stochastic and machine learning methods for multi-step ahead forecasting of hydrological processes, Stochastic Environmental Research & Risk Assessment, doi:10.1007/s00477-018-1638-6, 2019.