Isolation forest
Most existing model-based approaches to anomaly detection construct a profile of normal instances, then identify instances that do not conform to the normal profile as anomalies. This paper proposes a fundamentally different model-based method that explicitly isolates anomalies instead of profiling n...
Main Authors: | Liu, Fei; Ting, Kaiming; Zhou, Zhi-Hua |
---|---|
Format: | Conference Object |
Language: | unknown |
Published: | Pisa, IEEE Computer Society, 2008 |
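Subjects: | 0801 Artificial Intelligence and Image Processing; Computational complexity; Data mining; Learning (artificial intelligence); Trees (mathematics) |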
Online Access: | http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294 |
id |
ftfederationuniv:vital:6367 |
---|---|
record_format |
openpolar |
spelling |
ftfederationuniv:vital:6367 2023-05-15T17:53:47+02:00 Isolation forest Liu, Fei Ting, Kaiming Zhou, Zhi-Hua 2008 http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294 unknown Pisa IEEE Computer Society Proceedings of the Eighth IEEE International Conference on Data Mining p. 413-422 http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294 vital:6367 ISBN:9780769535029 This metadata is freely available under a CC0 license 0801 Artificial Intelligence and Image Processing Computational complexity Data mining Learning (artificial intelligence) Trees (mathematics) Text Conference paper 2008 ftfederationuniv 2022-12-01T19:04:49Z Most existing model-based approaches to anomaly detection construct a profile of normal instances, then identify instances that do not conform to the normal profile as anomalies. This paper proposes a fundamentally different model-based method that explicitly isolates anomalies instead of profiling normal points. To the best of our knowledge, the concept of isolation has not been explored in current literature. The use of isolation enables the proposed method, iForest, to exploit sub-sampling to an extent that is not feasible in existing methods, creating an algorithm which has a linear time complexity with a low constant and a low memory requirement. Our empirical evaluation shows that iForest performs favourably compared with ORCA (a near-linear time complexity distance-based method), LOF and random forests in terms of AUC and processing time, especially on large data sets. iForest also works well in high-dimensional problems which have a large number of irrelevant attributes, and in situations where the training set does not contain any anomalies. Conference Object Orca Federation University Australia: Federation ResearchOnline |
institution |
Open Polar |
collection |
Federation University Australia: Federation ResearchOnline |
op_collection_id |
ftfederationuniv |
language |
unknown |
topic |
0801 Artificial Intelligence and Image Processing Computational complexity Data mining Learning (artificial intelligence) Trees (mathematics) |
spellingShingle |
0801 Artificial Intelligence and Image Processing Computational complexity Data mining Learning (artificial intelligence) Trees (mathematics) Liu, Fei Ting, Kaiming Zhou, Zhi-Hua Isolation forest |
topic_facet |
0801 Artificial Intelligence and Image Processing Computational complexity Data mining Learning (artificial intelligence) Trees (mathematics) |
description |
Most existing model-based approaches to anomaly detection construct a profile of normal instances, then identify instances that do not conform to the normal profile as anomalies. This paper proposes a fundamentally different model-based method that explicitly isolates anomalies instead of profiling normal points. To the best of our knowledge, the concept of isolation has not been explored in current literature. The use of isolation enables the proposed method, iForest, to exploit sub-sampling to an extent that is not feasible in existing methods, creating an algorithm which has a linear time complexity with a low constant and a low memory requirement. Our empirical evaluation shows that iForest performs favourably compared with ORCA (a near-linear time complexity distance-based method), LOF and random forests in terms of AUC and processing time, especially on large data sets. iForest also works well in high-dimensional problems which have a large number of irrelevant attributes, and in situations where the training set does not contain any anomalies. |
format |
Conference Object |
author |
Liu, Fei Ting, Kaiming Zhou, Zhi-Hua |
author_facet |
Liu, Fei Ting, Kaiming Zhou, Zhi-Hua |
author_sort |
Liu, Fei |
title |
Isolation forest |
title_short |
Isolation forest |
title_full |
Isolation forest |
title_fullStr |
Isolation forest |
title_full_unstemmed |
Isolation forest |
title_sort |
isolation forest |
publisher |
Pisa IEEE Computer Society |
publishDate |
2008 |
url |
http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294 |
genre |
Orca |
genre_facet |
Orca |
op_relation |
Proceedings of the Eighth IEEE International Conference on Data Mining p. 413-422 http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294 vital:6367 ISBN:9780769535029 |
op_rights |
This metadata is freely available under a CC0 license |
_version_ |
1766161483981389824 |
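
The description in this record says only that iForest isolates anomalies and relies on sub-sampling; it gives no algorithmic detail. As a rough illustration of that idea, the following Python sketch builds random isolation trees on small sub-samples and scores points by how quickly they are isolated. The function names (`c_factor`, `build_tree`, `path_length`, `anomaly_scores`), the defaults of 100 trees and a sub-sample of 256, and the path-length normalisation are assumptions based on the commonly described form of the algorithm, not taken from this record, and the code is not the authors' implementation.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup
    over n points; used here (as an assumption) to normalise path lengths."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximates H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random attribute at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split further on this attribute
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x, depth=0):
    """Number of edges from the root to the leaf that x falls into, plus an
    adjustment for points left unsplit in that leaf."""
    if "split" not in tree:
        return depth + c_factor(tree["size"])
    child = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
    return path_length(child, x, depth + 1)


def anomaly_scores(data, queries, n_trees=100, sample_size=256, seed=0):
    """Score each query in (0, 1]; higher means easier to isolate.
    Assumes data is a list of equal-length numeric tuples with >= 2 rows."""
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(psi)
    return [
        2.0 ** (-sum(path_length(t, q) for t in trees) / (n_trees * norm))
        for q in queries
    ]


if __name__ == "__main__":
    gen = random.Random(1)
    cluster = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(500)]
    data = cluster + [(10.0, 10.0)]  # one obvious outlier
    print(anomaly_scores(data, [(0.0, 0.0), (10.0, 10.0)]))
```

Because each tree sees only a fixed-size sub-sample and its depth is capped, the cost of building the forest in this sketch does not grow with the size of the data set, and scoring is linear in the number of points scored, which is one way to read the record's claim of linear time complexity and low memory use.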