Comparison of IDF estimation methods

D. Veneziano, C. Lepore, A. Langousis, and P. Furcolo, Comparison of IDF estimation methods, International Precipitation Conference (IPC09), Paris, Université Paris-Est, École Nationale des Ponts et Chaussées, 2007.

We compare several estimators of intensity-duration-frequency (IDF) curves from continuous at-site rainfall records. These include parametric and semi-parametric estimators based on the historical annual maxima (AM estimators), peak-over-threshold (POT) methods, estimators based on the marginal distribution (MD) of rainfall intensity, and hybrid (HY) estimators that combine marginal and annual-maximum information. The comparison is in terms of bias, variance, and root-mean-square (RMS) error. These performance measures vary with the length of the record D, the averaging duration d, and the return period T. The analysis uses subsets of actual and synthetic rainfall records with durations between 24 and 1000 years. The empirical IDF curves from each entire record serve as the reference for assessing bias. A further element of comparison is the sensitivity of the estimators to outliers, defined as annual extremes whose estimated return period far exceeds the duration of the record.

Broadly speaking, one would expect AM estimators to perform best for very long records, MD estimators to be superior when only a few years of data are available, and POT estimators to be competitive in the intermediate case of a few decades of record. These expectations rest on the qualitative reasoning that, going from AM to POT to MD methods, the sample becomes larger (reducing the error variance) while the observed variable becomes further removed from the annual maximum (which may increase the bias).

The AM methods assume that the IDF value is a separable function of d and T, i.e., the product of a function of d and a function of T. Parametric versions of these methods specify the form of both functions except for a few parameters, whereas semi-parametric versions specify only the functional dependence on d. In the parametric case, the dependence on T follows from the assumption that the annual maximum has a generalized extreme value (GEV) distribution. We find that the separability assumption is often violated and that the AM-based IDF estimates for long return periods are highly variable and sensitive to outliers.
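
As a rough illustration of the comparison described above, the following Python sketch computes a T-year value from a GEV annual-maximum distribution and then evaluates bias, variance, and RMS error of several such estimates against a reference value. It is not the authors' procedure: the function names, the GEV parameter triples, and the reference value are all hypothetical, and the shape parameter follows the convention in which a positive value means a heavy upper tail.

```python
import math
from statistics import fmean


def gev_return_level(mu, sigma, xi, T):
    """T-year quantile of a GEV annual-maximum distribution.

    mu, sigma, xi: location, scale (> 0) and shape; xi > 0 gives a heavy upper tail.
    T: return period in years (T > 1).
    """
    y = -math.log(1.0 - 1.0 / T)  # -ln of the annual non-exceedance probability
    if xi == 0.0:
        return mu - sigma * math.log(y)  # Gumbel limit
    # expm1 keeps the expression accurate when xi is close to zero
    return mu + sigma / xi * math.expm1(-xi * math.log(y))


def error_measures(estimates, reference):
    """Bias, variance and RMS error of repeated estimates against a reference."""
    mean_est = fmean(estimates)
    bias = mean_est - reference
    # Population variance, so that rmse**2 == bias**2 + variance
    variance = fmean([(e - mean_est) ** 2 for e in estimates])
    rmse = math.sqrt(fmean([(e - reference) ** 2 for e in estimates]))
    return bias, variance, rmse


# Hypothetical GEV parameters, as if fitted to three sub-records of one long record
fitted = [(20.0, 6.0, 0.10), (22.0, 5.0, 0.05), (19.0, 7.0, 0.20)]
estimates = [gev_return_level(mu, s, xi, T=100) for mu, s, xi in fitted]

reference = 55.0  # hypothetical empirical 100-year value from the entire record
bias, variance, rmse = error_measures(estimates, reference)
print(f"bias={bias:.2f}  variance={variance:.2f}  rmse={rmse:.2f}")
```

In practice the three measures would be evaluated separately for each combination of D, d, and T, since the abstract notes that performance varies with all three.
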
The variability and outlier sensitivity of the AM estimates are most pronounced for the semi-parametric methods, which impose only loose constraints on the tail of the annual-maximum distribution. None of the other methods assumes separability.

We apply the POT method assuming that the excess above the threshold has either a Pareto or a generalized Pareto (GP) distribution. In the Pareto case, the model is highly constrained and produces very high bias, especially for large T, because empirical distributions deviate significantly from the Pareto form. In the GP case the bias is small but, unless the record is very long, the variance is large because the shape parameter of the distribution is difficult to constrain.

Marginal-distribution methods assume that the rainfall intensities in separate d-intervals are independent and have a lognormal tail. These methods are statistically stable, robust against outliers, and applicable even when the rainfall record is short.

Finally, hybrid methods scale the annual-maximum distribution obtained from MD analysis so that its mean coincides with the average of the recorded annual maxima. These estimators perform best when the continuous record spans a few decades, or when the continuous record is short but a long annual-maximum series is also available.
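
The MD and hybrid ideas described above can be sketched in Python under some loudly stated assumptions. The independence of d-intervals gives an annual-maximum distribution F_max(x) = F(x)^n, where n is the number of d-intervals per year. The sketch treats the interval intensity as lognormal wherever it is evaluated, which is reasonable only because, for large n, the annual maximum depends on the upper tail alone. The hybrid step is implemented as a multiplicative rescaling of the MD quantiles, which is one plausible reading of "scale" and not necessarily the authors' exact procedure. All function names and numerical values are hypothetical.

```python
import math
from statistics import NormalDist, fmean

_STD_NORMAL = NormalDist()


def md_annual_max_quantile(p, m, s, n_per_year):
    """Quantile of the annual maximum at non-exceedance probability p.

    Assumes n_per_year independent d-interval intensities per year, each with
    a lognormal upper tail (log-mean m, log-standard-deviation s), so that
    F_max(x) = F(x) ** n_per_year.
    """
    p_interval = p ** (1.0 / n_per_year)  # same level, seen from a single interval
    return math.exp(m + s * _STD_NORMAL.inv_cdf(p_interval))


def md_return_level(T, m, s, n_per_year):
    """T-year intensity implied by the marginal distribution alone."""
    return md_annual_max_quantile(1.0 - 1.0 / T, m, s, n_per_year)


def md_annual_max_mean(m, s, n_per_year, k=2000):
    """Mean of the MD-implied annual maximum (midpoint rule on its quantile function)."""
    return fmean(
        [md_annual_max_quantile((i + 0.5) / k, m, s, n_per_year) for i in range(k)]
    )


def hybrid_return_level(T, m, s, n_per_year, observed_maxima):
    """MD return level rescaled so the model mean matches the observed AM mean."""
    factor = fmean(observed_maxima) / md_annual_max_mean(m, s, n_per_year)
    return factor * md_return_level(T, m, s, n_per_year)


# Hypothetical inputs: hourly intensities (d = 1 h, 8760 intervals per year)
m, s, n = 0.0, 1.0, 8760
observed_maxima = [30.0, 42.0, 36.0, 51.0, 28.0]  # hypothetical recorded annual maxima
print(md_return_level(100, m, s, n))
print(hybrid_return_level(100, m, s, n, observed_maxima))
```

The rescaling uses only the sample mean of the annual maxima, which is far less affected by a single extreme year than a fitted tail parameter would be; this matches the robustness the abstract attributes to the MD-based estimators.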