Influence of Hyperparameters on Random Forest Accuracy
In this paper we present our work on the Random Forest (RF) family of classification methods. Our goal is to go one step further in the understanding of RF mechanisms by studying the parametrization of the reference algorithm Forest-RI. In this algorithm, a randomization principle is used during the...
Main Authors: | Bernard, Simon; Heutte, Laurent; Adam, Sébastien |
---|---|
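Other Authors: | Equipe Apprentissage (DocApp - LITIS); Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes (LITIS); Université Le Havre Normandie (ULH); Normandie Université (NU)-Normandie Université (NU)-Université de Rouen Normandie (UNIROUEN); Normandie Université (NU)-Institut national des sciences appliquées Rouen Normandie (INSA Rouen Normandie); Institut National des Sciences Appliquées (INSA)-Normandie Université (NU)-Institut National des Sciences Appliquées (INSA)-Université Le Havre Normandie (ULH); Institut National des Sciences Appliquées (INSA)-Normandie Université (NU)-Institut National des Sciences Appliquées (INSA) |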
Format: | Conference Object |
Language: | English |
Published: | HAL CCSD 2009 |
Subjects: | [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] |
Online Access: | https://hal.science/hal-00436358 https://hal.science/hal-00436358/document https://hal.science/hal-00436358/file/mcs09.pdf https://doi.org/10.1007/978-3-642-02326-2_18 |
id |
ftnormandieuniv:oai:HAL:hal-00436358v1 |
---|---|
record_format |
openpolar |
spelling |
ftnormandieuniv:oai:HAL:hal-00436358v1 2024-01-21T10:07:24+01:00 Influence of Hyperparameters on Random Forest Accuracy Bernard, Simon Heutte, Laurent Adam, Sébastien Equipe Apprentissage (DocApp - LITIS) Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes (LITIS) Université Le Havre Normandie (ULH) Normandie Université (NU)-Normandie Université (NU)-Université de Rouen Normandie (UNIROUEN) Normandie Université (NU)-Institut national des sciences appliquées Rouen Normandie (INSA Rouen Normandie) Institut National des Sciences Appliquées (INSA)-Normandie Université (NU)-Institut National des Sciences Appliquées (INSA)-Université Le Havre Normandie (ULH) Institut National des Sciences Appliquées (INSA)-Normandie Université (NU)-Institut National des Sciences Appliquées (INSA) Reykjavik, Iceland 2009-06-10 https://hal.science/hal-00436358 https://hal.science/hal-00436358/document https://hal.science/hal-00436358/file/mcs09.pdf https://doi.org/10.1007/978-3-642-02326-2_18 en eng HAL CCSD Springer info:eu-repo/semantics/altIdentifier/doi/10.1007/978-3-642-02326-2_18 hal-00436358 https://hal.science/hal-00436358 https://hal.science/hal-00436358/document https://hal.science/hal-00436358/file/mcs09.pdf doi:10.1007/978-3-642-02326-2_18 info:eu-repo/semantics/OpenAccess International Workshop on Multiple Classifier Systems (MCS) https://hal.science/hal-00436358 International Workshop on Multiple Classifier Systems (MCS), Jun 2009, Reykjavik, Iceland. pp.171-180, ⟨10.1007/978-3-642-02326-2_18⟩ [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] info:eu-repo/semantics/conferenceObject Conference papers 2009 ftnormandieuniv https://doi.org/10.1007/978-3-642-02326-2_18 2023-12-26T23:40:02Z In this paper we present our work on the Random Forest (RF) family of classification methods. Our goal is to go one step further in the understanding of RF mechanisms by studying the parametrization of the reference algorithm Forest-RI. 
In this algorithm, a randomization principle is used during the tree induction process: K features are randomly selected at each node, and the best split is chosen among them. The strength of randomization in the tree induction is thus governed by the hyperparameter K, which plays an important role in building accurate RF classifiers. We have decided to focus our experimental study on this hyperparameter and on its influence on classification accuracy. For that purpose, we have evaluated the Forest-RI algorithm on several machine learning problems and with different settings of K in order to understand the way it acts on RF performance. We show that the default values of K traditionally used in the literature are globally near-optimal, except for some cases in which they are all significantly sub-optimal. Additional experiments have therefore been conducted on those datasets; they highlight the crucial role played by feature relevancy in finding the optimal setting of K. Conference Object Iceland Normandie Université: HAL 171 180 |
institution |
Open Polar |
collection |
Normandie Université: HAL |
op_collection_id |
ftnormandieuniv |
language |
English |
topic |
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] |
spellingShingle |
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] Bernard, Simon Heutte, Laurent Adam, Sébastien Influence of Hyperparameters on Random Forest Accuracy |
topic_facet |
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] |
description |
In this paper we present our work on the Random Forest (RF) family of classification methods. Our goal is to go one step further in the understanding of RF mechanisms by studying the parametrization of the reference algorithm Forest-RI. In this algorithm, a randomization principle is used during the tree induction process: K features are randomly selected at each node, and the best split is chosen among them. The strength of randomization in the tree induction is thus governed by the hyperparameter K, which plays an important role in building accurate RF classifiers. We have decided to focus our experimental study on this hyperparameter and on its influence on classification accuracy. For that purpose, we have evaluated the Forest-RI algorithm on several machine learning problems and with different settings of K in order to understand the way it acts on RF performance. We show that the default values of K traditionally used in the literature are globally near-optimal, except for some cases in which they are all significantly sub-optimal. Additional experiments have therefore been conducted on those datasets; they highlight the crucial role played by feature relevancy in finding the optimal setting of K. |
author2 |
Equipe Apprentissage (DocApp - LITIS) Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes (LITIS) Université Le Havre Normandie (ULH) Normandie Université (NU)-Normandie Université (NU)-Université de Rouen Normandie (UNIROUEN) Normandie Université (NU)-Institut national des sciences appliquées Rouen Normandie (INSA Rouen Normandie) Institut National des Sciences Appliquées (INSA)-Normandie Université (NU)-Institut National des Sciences Appliquées (INSA)-Université Le Havre Normandie (ULH) Institut National des Sciences Appliquées (INSA)-Normandie Université (NU)-Institut National des Sciences Appliquées (INSA) |
format |
Conference Object |
author |
Bernard, Simon Heutte, Laurent Adam, Sébastien |
author_facet |
Bernard, Simon Heutte, Laurent Adam, Sébastien |
author_sort |
Bernard, Simon |
title |
Influence of Hyperparameters on Random Forest Accuracy |
title_short |
Influence of Hyperparameters on Random Forest Accuracy |
title_full |
Influence of Hyperparameters on Random Forest Accuracy |
title_fullStr |
Influence of Hyperparameters on Random Forest Accuracy |
title_full_unstemmed |
Influence of Hyperparameters on Random Forest Accuracy |
title_sort |
influence of hyperparameters on random forest accuracy |
publisher |
HAL CCSD |
publishDate |
2009 |
url |
https://hal.science/hal-00436358 https://hal.science/hal-00436358/document https://hal.science/hal-00436358/file/mcs09.pdf https://doi.org/10.1007/978-3-642-02326-2_18 |
op_coverage |
Reykjavik, Iceland |
genre |
Iceland |
genre_facet |
Iceland |
op_source |
International Workshop on Multiple Classifier Systems (MCS) https://hal.science/hal-00436358 International Workshop on Multiple Classifier Systems (MCS), Jun 2009, Reykjavik, Iceland. pp.171-180, ⟨10.1007/978-3-642-02326-2_18⟩ |
op_relation |
info:eu-repo/semantics/altIdentifier/doi/10.1007/978-3-642-02326-2_18 hal-00436358 https://hal.science/hal-00436358 https://hal.science/hal-00436358/document https://hal.science/hal-00436358/file/mcs09.pdf doi:10.1007/978-3-642-02326-2_18 |
op_rights |
info:eu-repo/semantics/OpenAccess |
op_doi |
https://doi.org/10.1007/978-3-642-02326-2_18 |
container_start_page |
171 |
op_container_end_page |
180 |
_version_ |
1788697960983298048 |
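
The description above states that Forest-RI randomly selects K features at each node and chooses the best split among them. A minimal sketch of that single node-level step follows, assuming Gini impurity as the split criterion and midpoints between consecutive values as candidate thresholds; neither choice is stated in this record, and the function names and toy data are hypothetical, introduced only for illustration.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split_among_k(X, y, k, rng):
    """Draw k feature indices at random and return them together with
    the best (impurity, feature, threshold) found among those features.

    The second element is None if none of the drawn features has at
    least two distinct values.
    """
    n_features = len(X[0])
    candidates = rng.sample(range(n_features), k)
    best = None
    for f in candidates:
        values = sorted(set(row[f] for row in X))
        # Candidate thresholds: midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return candidates, best


# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 5.0], [1.0, 3.0], [2.0, 5.0], [3.0, 3.0]]
y = [0, 0, 1, 1]

# k equal to the number of features: every feature is examined.
print(best_split_among_k(X, y, 2, random.Random(0)))
# k = 1: only one randomly drawn feature is examined.
print(best_split_among_k(X, y, 1, random.Random(1)))
```

With K equal to the number of features, the node always finds the informative feature; with K = 1, the split depends entirely on which feature is drawn. This is one way to read the description's remark that feature relevancy matters when setting K.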