Isolation forest

Most existing model-based approaches to anomaly detection construct a profile of normal instances, then identify instances that do not conform to the normal profile as anomalies. This paper proposes a fundamentally different model-based method that explicitly isolates anomalies instead of profiles n...

Bibliographic Details
Main Authors: Liu, Fei; Ting, Kaiming; Zhou, Zhi-Hua
Format: Conference Object
Language: unknown
Published: Pisa: IEEE Computer Society, 2008
Subjects:
Online Access: http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294
id ftfederationuniv:vital:6367
record_format openpolar
institution Open Polar
collection Federation University Australia: Federation ResearchOnline
op_collection_id ftfederationuniv
language unknown
topic 0801 Artificial Intelligence and Image Processing
Computational complexity
Data mining
Learning (artificial intelligence)
Trees (mathematics)
description Most existing model-based approaches to anomaly detection construct a profile of normal instances, then identify instances that do not conform to the normal profile as anomalies. This paper proposes a fundamentally different model-based method that explicitly isolates anomalies instead of profiling normal points. To the best of our knowledge, the concept of isolation has not been explored in the current literature. The use of isolation enables the proposed method, iForest, to exploit sub-sampling to an extent that is not feasible in existing methods, yielding an algorithm with linear time complexity, a low constant, and a low memory requirement. Our empirical evaluation shows that iForest compares favourably with ORCA (a near-linear time complexity distance-based method), LOF, and random forests in terms of AUC and processing time, especially on large data sets. iForest also works well in high-dimensional problems that have a large number of irrelevant attributes, and in situations where the training set does not contain any anomalies.
format Conference Object
author Liu, Fei
Ting, Kaiming
Zhou, Zhi-Hua
title Isolation forest
publisher Pisa IEEE Computer Society
publishDate 2008
url http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294
genre Orca
genre_facet Orca
op_relation Proceedings of the Eighth IEEE International Conference on Data Mining p. 413-422
http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294
vital:6367
ISBN:9780769535029
op_rights This metadata is freely available under a CC0 license