Theoretical and empirical comparison of stochastic and machine learning methods for hydrological processes forecasting

G. Papacharalampous, Theoretical and empirical comparison of stochastic and machine learning methods for hydrological processes forecasting, Postgraduate Thesis, 372 pages, Department of Water Resources and Environmental Engineering – National Technical University of Athens, Athens, October 2016.

Forecasting the future behaviour of hydrological processes is useful in the design and operation of hydraulic engineering works. While the attention given to probabilistic forecasting is growing, there is still considerable practical and scientific interest in point estimation. Machine learning methods have established themselves as a promising approach to hydrological forecasting and, as a result, research in hydrology often focuses on comparing machine learning methods with classical stochastic methods. The comparisons performed in the literature are usually based on case studies. This thesis conducts a theoretical comparison of the forecasting performance of several classical stochastic and machine learning point estimation methods by performing large-scale computational experiments based on simulations. The purpose of the thesis is to provide generalized results. The theoretical comparison is accompanied by a small-scale empirical comparison to highlight important points. Emphasis is placed on Support Vector Machines (SVM), which constitute the most popular newly introduced category of machine learning methods in the field of hydrology, while the well-established Neural Networks (NN) are also included in the comparison. The comparison refers to long-term forecasting on the observation time scale, although short-term forecasting is also useful. As regards the methodology, a total of 28 methods are used, 9 of which are machine learning methods. Six of the latter are built using an SVM algorithm and the remaining three using an NN algorithm. Twenty simulation experiments are performed, each using 2 000 simulated time series.
The time series are simulated using stochastic models from the frequently used model families Autoregressive Moving Average (ARMA), Autoregressive Integrated Moving Average (ARIMA), Autoregressive Fractionally Integrated Moving Average (ARFIMA) and Seasonal Autoregressive Integrated Moving Average (SARIMA). Additionally, 8 computational experiments are carried out, each using one historical time series. Each time series is divided into two parts. The first part is used for training the model and the second for testing its forecast. The comparative assessment of the methods is based on 22 metrics that quantify the methods’ performance according to several criteria. These criteria are related to the bias with respect to the mean and standard deviation, the accuracy and the correlation. The most important outcome of this thesis is that, in general, no method is uniformly better or worse than the others. However, there are methods that are regularly better or worse than others according to specific metrics. Although a general ranking of the methods is not possible, it appears that the methods can be classified, to some extent, according to their similar or contrasting performance in the various metrics. Another important conclusion is that more sophisticated methods do not necessarily provide better forecasts than simpler methods. It is pointed out that machine learning methods do not differ dramatically from classical stochastic methods, while it is interesting that the SVM and NN algorithms used in this thesis offer potentially very good performance in terms of accuracy, compared to the overall picture. It should be noted that, although the present thesis focuses on hydrological processes, the results are of general scientific interest and they also concern all possible observation time scales. In addition to the use of simulated processes, another important point in the present thesis is the use of several methods and metrics.
Using fewer methods and fewer metrics would have led to a very different overall picture, particularly if those fewer metrics corresponded to fewer criteria. For this reason, the methodology proposed in the thesis is considered more appropriate for the evaluation of forecasting methods than comparisons based on only a few methods and metrics.
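
The simulation-based comparison described above can be illustrated with a minimal sketch. It is not the code used in the thesis: the AR(1) generating model, the two forecasting methods (a naive benchmark and a fitted AR(1) model), the RMSE metric, the function names and all parameter values are assumptions chosen only to show the workflow of simulating many series, splitting each into a training and a test part, forecasting the test part and averaging a metric per method.

```python
import math
import random


def simulate_ar1(n, phi, sigma=1.0, seed=None):
    """Simulate n values of a zero-mean AR(1) process (requires |phi| < 1)."""
    rng = random.Random(seed)
    # Draw the starting value from the stationary distribution (no burn-in needed).
    x = rng.gauss(0.0, sigma / math.sqrt(1.0 - phi ** 2))
    series = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma)
        series.append(x)
    return series


def forecast_naive(train, horizon):
    """Naive benchmark: repeat the last observed value."""
    return [train[-1]] * horizon


def forecast_ar1(train, horizon):
    """Multi-step point forecasts from an AR(1) model fitted by moments."""
    mean = sum(train) / len(train)
    dev = [v - mean for v in train]
    # Lag-1 sample autocorrelation used as the estimate of phi.
    phi = sum(a * b for a, b in zip(dev[1:], dev[:-1])) / sum(d * d for d in dev)
    return [mean + dev[-1] * phi ** h for h in range(1, horizon + 1)]


def rmse(forecast, observed):
    """Root mean square error between forecasts and observations."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def run_experiment(n_series=200, n=110, horizon=10, phi=0.7, seed=0):
    """Average the RMSE of each method over many simulated series."""
    methods = {"naive": forecast_naive, "ar1": forecast_ar1}
    scores = {name: [] for name in methods}
    for i in range(n_series):
        series = simulate_ar1(n, phi, seed=seed + i)
        # First part for training, last `horizon` values for testing.
        train, test = series[:-horizon], series[-horizon:]
        for name, method in methods.items():
            scores[name].append(rmse(method(train, horizon), test))
    return {name: sum(v) / len(v) for name, v in scores.items()}


if __name__ == "__main__":
    for name, score in run_experiment().items():
        print(f"{name}: mean RMSE = {score:.3f}")
```

In the same spirit as the thesis, the sketch could be extended by adding further entries to the `methods` dictionary and further metric functions (for example, for bias or correlation), since the conclusions drawn from one method pair and one metric may not carry over to a broader set.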