Isolation-Based Anomaly Detection
Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation withou...
Published in: | ACM Transactions on Knowledge Discovery from Data |
---|---|
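Main Authors: | Liu, Fei Tony; Ting, Kai Ming; Zhou, Zhi-Hua |
Other Authors: | Ministry of Science and Technology of the People's Republic of China; National Natural Science Foundation of Jiangsu Province; National Natural Science Foundation of China |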
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | Association for Computing Machinery (ACM), 2012 |
Online Access: | http://dx.doi.org/10.1145/2133360.2133363 https://dl.acm.org/doi/pdf/10.1145/2133360.2133363 |
id | cracm:10.1145/2133360.2133363 |
---|---|
record_format | openpolar |
institution | Open Polar |
collection | ACM Publications (Association for Computing Machinery) |
op_collection_id | cracm |
language | English |
description | Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without employing any distance or density measure, which makes it fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time complexity and a small memory requirement and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in the training sample. |
author2 | Ministry of Science and Technology of the People's Republic of China; National Natural Science Foundation of Jiangsu Province; National Natural Science Foundation of China |
format | Article in Journal/Newspaper |
author | Liu, Fei Tony; Ting, Kai Ming; Zhou, Zhi-Hua |
spellingShingle | Liu, Fei Tony; Ting, Kai Ming; Zhou, Zhi-Hua; Isolation-Based Anomaly Detection |
author_facet | Liu, Fei Tony; Ting, Kai Ming; Zhou, Zhi-Hua |
author_sort | Liu, Fei Tony |
title | Isolation-Based Anomaly Detection |
title_short | Isolation-Based Anomaly Detection |
title_full | Isolation-Based Anomaly Detection |
title_fullStr | Isolation-Based Anomaly Detection |
title_full_unstemmed | Isolation-Based Anomaly Detection |
title_sort | isolation-based anomaly detection |
publisher | Association for Computing Machinery (ACM) |
publishDate | 2012 |
url | http://dx.doi.org/10.1145/2133360.2133363 https://dl.acm.org/doi/pdf/10.1145/2133360.2133363 |
genre | Orca |
genre_facet | Orca |
op_source | ACM Transactions on Knowledge Discovery from Data volume 6, issue 1, page 1-39 ISSN 1556-4681 1556-472X |
op_doi | https://doi.org/10.1145/2133360.2133363 |
container_title | ACM Transactions on Knowledge Discovery from Data |
container_volume | 6 |
container_issue | 1 |
container_start_page | 1 |
op_container_end_page | 39 |
_version_ | 1812817255064403968 |
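
The abstract in this record describes isolating anomalies by random partitioning on small subsamples, without any distance or density measure. The following is a minimal Python sketch of that idea, not the authors' implementation. The record gives no formulas, so the path-length normalisation and the score 2^(-E[h(x)]/c(psi)) follow the commonly cited formulation of the method and should be read as assumptions. All function names, the default parameters (100 trees, subsample size 256), and the toy data are illustrative.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # used to approximate harmonic numbers


def average_path_length(n):
    """Assumed normaliser c(n): average path length of an unsuccessful
    binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random attribute at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on a constant attribute
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(x, node):
    """Depth at which x reaches a leaf, plus an adjustment for unsplit points."""
    depth = 0
    while "split" in node:
        node = node["left"] if x[node["dim"]] < node["split"] else node["right"]
        depth += 1
    return depth + average_path_length(node["size"])


def fit_forest(data, n_trees=100, sample_size=256, seed=0):
    """Build each tree on its own random subsample (assumes len(data) >= 2)."""
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, psi


def anomaly_score(x, trees, psi):
    """Score in (0, 1]; shorter average paths give scores closer to 1."""
    mean_path = sum(path_length(x, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / average_path_length(psi))


# Toy data: a 2-D Gaussian cluster plus one far-away point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((10.0, 10.0))

trees, psi = fit_forest(data)
outlier_score = anomaly_score((10.0, 10.0), trees, psi)
inlier_score = anomaly_score((0.0, 0.0), trees, psi)
print(f"outlier: {outlier_score:.3f}  inlier: {inlier_score:.3f}")
```

In this sketch the far-away point tends to be separated after very few random splits, so its average path is short and its score is higher than that of a point in the middle of the cluster. Capping tree height at ceil(log2(psi)) keeps the cost of each tree small, which is one way to read the abstract's claim about low time complexity and memory use under subsampling.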