Stability selection for component-wise gradient boosting in multiple dimensions

We present a new algorithm for boosting generalized additive models for location, scale and shape (GAMLSS) that makes it possible to incorporate stability selection, an increasingly popular way to obtain stable sets of covariates while controlling the per-family error rate (PFER). The model is fitted repeatedl...

Bibliographic Details
Main Authors: Thomas, Janek, Mayr, Andreas, Bischl, Bernd, Schmid, Matthias, Smith, Adam, Hofner, Benjamin
Format: Text
Language: unknown
Published: arXiv 2016
Subjects:
Online Access: https://dx.doi.org/10.48550/arxiv.1611.10171
https://arxiv.org/abs/1611.10171
id ftdatacite:10.48550/arxiv.1611.10171
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic Computation stat.CO
Machine Learning stat.ML
FOS Computer and information sciences
description We present a new algorithm for boosting generalized additive models for location, scale and shape (GAMLSS) that makes it possible to incorporate stability selection, an increasingly popular way to obtain stable sets of covariates while controlling the per-family error rate (PFER). The model is fitted repeatedly to subsampled data, and variables with high selection frequencies are extracted. To apply stability selection to boosted GAMLSS, we develop a new "noncyclical" fitting algorithm that incorporates an additional selection step of the best-fitting distribution parameter in each iteration. This new algorithm has the additional advantage that optimizing the tuning parameters of boosting is reduced from a multi-dimensional to a one-dimensional problem with vastly decreased complexity. The performance of the novel algorithm is evaluated in an extensive simulation study. We apply this new algorithm to a study estimating the abundance of common eider in Massachusetts, USA, featuring excess zeros, overdispersion, non-linearity and spatio-temporal structures. Eider abundance is estimated via boosted GAMLSS, allowing both mean and overdispersion to be regressed on covariates. Stability selection is used to obtain a sparse set of stable predictors. 16 pages.
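The subsampling idea in the description (fit repeatedly to subsampled data, keep variables with high selection frequencies) can be illustrated with a minimal sketch. This is not the authors' GAMLSS algorithm: it assumes a single-parameter model (the mean only), squared-error loss and simple linear base-learners, and the function names, the step length `nu`, the number of variables per fit `q` and the threshold are all hypothetical choices made for illustration. The PFER figure it returns is the standard Meinshausen-Bühlmann upper bound q^2 / ((2 * threshold - 1) * p).

```python
import numpy as np


def componentwise_boost(X, y, n_steps=50, nu=0.1, q=None):
    """Component-wise L2 boosting with one linear base-learner per column.

    Illustrative only: mean model, squared-error loss. Returns the set of
    selected column indices and stops before a (q+1)-th distinct column
    would enter the model.
    """
    Xc = X - X.mean(axis=0)              # centre columns
    norms = (Xc ** 2).sum(axis=0)        # assumes no constant columns
    resid = y - y.mean()
    selected = set()
    for _ in range(n_steps):
        coefs = Xc.T @ resid / norms     # least-squares fit per column
        gain = coefs ** 2 * norms        # reduction in residual sum of squares
        j = int(np.argmax(gain))         # best-fitting base-learner
        if q is not None and j not in selected and len(selected) >= q:
            break
        selected.add(j)
        resid = resid - nu * coefs[j] * Xc[:, j]   # small step of length nu
    return selected


def stability_selection(X, y, q=5, threshold=0.8, n_subsamples=50, seed=0):
    """Selection frequencies over random half-samples (threshold must exceed 0.5)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        for j in componentwise_boost(X[idx], y[idx], q=q):
            counts[j] += 1
    freq = counts / n_subsamples
    stable = np.flatnonzero(freq >= threshold)
    # Upper bound on the per-family error rate (expected false selections)
    pfer_bound = q ** 2 / ((2 * threshold - 1) * p)
    return stable, freq, pfer_bound
```

In the noncyclical setting described above, each boosting iteration would additionally choose which distribution parameter to update; that extra selection step is not shown here.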
format Text
author Thomas, Janek
Mayr, Andreas
Bischl, Bernd
Schmid, Matthias
Smith, Adam
Hofner, Benjamin
author_sort Thomas, Janek
title Stability selection for component-wise gradient boosting in multiple dimensions
publisher arXiv
publishDate 2016
url https://dx.doi.org/10.48550/arxiv.1611.10171
https://arxiv.org/abs/1611.10171
genre Common Eider
op_relation https://dx.doi.org/10.1007/s11222-017-9754-6
op_rights arXiv.org perpetual, non-exclusive license
http://arxiv.org/licenses/nonexclusive-distrib/1.0/
op_doi https://doi.org/10.48550/arxiv.1611.10171
https://doi.org/10.1007/s11222-017-9754-6