Isolation-based anomaly detection
Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without e...
Published in: | ACM Transactions on Knowledge Discovery from Data |
---|---|
Main Authors: | Liu, Fei; Ting, Kaiming; Zhou, Zhi-Hua |
Format: | Article in Journal/Newspaper |
Language: | unknown |
Published: | 2012 |
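Subjects: | Information systems; Information systems applications; Data mining; Computing methodologies; Machine learning |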
Online Access: | http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/34102 https://doi.org/10.1145/2133360.2133363 |
id | ftfederationuniv:vital:6369 |
---|---|
record_format | openpolar |
institution | Open Polar |
collection | Federation University Australia: Federation ResearchOnline |
op_collection_id | ftfederationuniv |
language | unknown |
topic | Information systems; Information systems applications; Data mining; Computing methodologies; Machine learning |
topic_facet | Information systems; Information systems applications; Data mining; Computing methodologies; Machine learning |
description | Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without employing any distance or density measure, making it fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time complexity and a small memory requirement and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in the training sample. |
format | Article in Journal/Newspaper |
author | Liu, Fei; Ting, Kaiming; Zhou, Zhi-Hua |
author_facet | Liu, Fei; Ting, Kaiming; Zhou, Zhi-Hua |
author_sort | Liu, Fei |
title | Isolation-based anomaly detection |
publishDate | 2012 |
url | http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/34102 https://doi.org/10.1145/2133360.2133363 |
genre | Orca |
genre_facet | Orca |
op_relation | ACM Transactions on Knowledge Discovery from Data Vol. 6, no. 1 (2012), p. 1-39; http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/34102; vital:6369; https://doi.org/10.1145/2133360.2133363; ISSN:1556-4681 |
op_rights | Copyright Association for Computing Machinery, Inc. This metadata is freely available under a CC0 license |
op_doi | https://doi.org/10.1145/2133360.2133363 |
container_title | ACM Transactions on Knowledge Discovery from Data |
container_volume | 6 |
container_issue | 1 |
container_start_page | 1 |
op_container_end_page | 39 |
_version_ | 1766161446169739264 |
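
The isolation mechanism summarised in the description can be illustrated with a small sketch. The code below is not the authors' reference implementation; it is a minimal pure-Python approximation, assuming random axis-parallel splits, a depth limit of ceil(log2(subsample size)), and the usual path-length normalisation by the average length of an unsuccessful binary-search-tree lookup. All function names (`c`, `build_tree`, `path_length`, `fit_forest`, `anomaly_score`) and the defaults of 100 trees and a subsample of 256 points are illustrative choices, not taken from this record.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # used to approximate harmonic numbers


def c(n):
    """Average path length of an unsuccessful BST lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-parallel splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # a constant attribute cannot be split
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(x, node, depth=0):
    """Edges from the root to x's leaf, plus an adjustment for the
    points that remain unsplit in that leaf."""
    if "split" not in node:
        return depth + c(node["size"])
    child = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return path_length(x, child, depth + 1)


def fit_forest(data, n_trees=100, subsample=256, seed=0):
    """Build n_trees trees, each on its own random subsample of data."""
    psi = min(subsample, len(data))
    if psi < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(psi))
    trees = [
        build_tree(rng.sample(data, psi), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, psi


def anomaly_score(x, forest):
    """Score in (0, 1]; shorter average paths give higher scores."""
    trees, psi = forest
    mean_path = sum(path_length(x, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c(psi))


# Hypothetical demo data: a Gaussian cluster plus one distant point.
demo_rng = random.Random(42)
data = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(500)]
data.append((8.0, 8.0))
forest = fit_forest(data)
print(anomaly_score((8.0, 8.0), forest), anomaly_score((0.0, 0.0), forest))
```

In this sketch the distant point should receive a noticeably higher score than a point at the centre of the cluster, because random splits tend to separate it from the rest of the data after only a few partitions. Each tree sees only a small subsample, which is what keeps the building cost and memory use low regardless of the full data size.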