Gradient boosting for extremes: sampling theory and application to insurance
This paper develops a statistical learning theory for gradient boosting in Peaks-over-Threshold modeling with Generalized Pareto distributions, deriving non-asymptotic error bounds and showing that an orthogonal reparametrization of the likelihood reduces gradient correlation during training.
Provides error bounds and reparametrization for gradient boosting with GP distributions in Peaks-over-Threshold modeling.
Keywords
Applications
- Medical malpractice insurance
Before reading this…
To understand this paper, make sure you know these concepts first:
- Statistical learning theory
- Gradient boosting
- Generalized Pareto distributions
Abstract
We develop a statistical learning theory for gradient boosting applied to the estimation of covariate-dependent Generalized Pareto (GP) distributions in the context of Peaks-over-Threshold modeling. After an orthogonal reparametrization of the GP likelihood that diagonalizes its Fisher information matrix, we cast the estimation problem within the Empirical Risk Minimization (ERM) framework and derive non-asymptotic error bounds for the boosting estimator. Our analysis accounts for three distinct sources of error in the process: statistical fluctuations; the approximation bias inherent to the asymptotic nature of the GP model, which is controlled under second-order regular variation; and the approximation error associated with the finite number of boosting iterates. This makes the resulting bias-variance trade-off explicit. We illustrate the practical benefits of the reparametrization through simulations, showing that it significantly reduces gradient correlation during training and improves convergence stability. The methodology is applied to a medical malpractice insurance dataset from the Texas Department of Insurance, comprising over 18 000 closed claims. The gradient boosting approach yields a good fit for the tail of settlement cost distributions and reveals that the number of days to settlement is the dominant predictor of tail heaviness, consistent with earlier findings in the reserving literature.
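To make the idea of an orthogonal reparametrization concrete, here is a minimal numerical sketch. The abstract does not say which reparametrization the authors use, so the choice below is an assumption: it takes the standard GP parameters (shape ξ, scale σ) and replaces the scale by ν = σ(1 + ξ), a commonly used orthogonal parametrization of the GP distribution. The function names are hypothetical. The sketch evaluates the per-observation Fisher information in closed form (valid for ξ > -1/2) and checks that the off-diagonal entry vanishes after the change of variables.

```python
import numpy as np


def fisher_information_gp(xi, sigma):
    """Per-observation Fisher information of GP(xi, sigma), ordered (xi, sigma).

    Closed form, valid for xi > -1/2.
    """
    denom = (1.0 + xi) * (1.0 + 2.0 * xi)
    return np.array([
        [2.0 / denom, 1.0 / (sigma * denom)],
        [1.0 / (sigma * denom), 1.0 / (sigma**2 * (1.0 + 2.0 * xi))],
    ])


def fisher_information_orthogonal(xi, nu):
    """Fisher information in the assumed parametrization (xi, nu), nu = sigma * (1 + xi)."""
    sigma = nu / (1.0 + xi)
    # Jacobian of the map (xi, nu) -> (xi, sigma)
    jac = np.array([
        [1.0, 0.0],
        [-sigma / (1.0 + xi), 1.0 / (1.0 + xi)],
    ])
    # Fisher information transforms as J^T I J under a change of parameters
    return jac.T @ fisher_information_gp(xi, sigma) @ jac


xi, sigma = 0.3, 2.0  # illustrative values, not taken from the paper
info_std = fisher_information_gp(xi, sigma)
info_orth = fisher_information_orthogonal(xi, sigma * (1.0 + xi))

print("standard (xi, sigma) off-diagonal:", info_std[0, 1])
print("orthogonal (xi, nu) off-diagonal:", info_orth[0, 1])
```

In the standard parametrization the off-diagonal term is nonzero, so the expected gradients with respect to ξ and σ are coupled; in the (ξ, ν) parametrization it is zero up to floating-point error. This is consistent with the abstract's claim that orthogonalization reduces gradient correlation in training, although the paper's own construction may differ from this sketch.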