Isolation-Based Anomaly Detection

Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation withou...
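
As an illustration of the isolation idea in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that isolation is carried out by random axis-parallel splits on a random subsample, and that a point needing fewer splits to be separated is more likely an anomaly. The names `isolation_depth` and `mean_isolation_depth` and the parameters `n_trees`, `subsample`, and `max_depth` are hypothetical and chosen here for illustration only.

```python
import random


def isolation_depth(point, sample, rng, depth=0, max_depth=50):
    """Count the random axis-parallel splits needed to separate `point` from `sample`.

    Sketch only: the path is generated lazily for one point instead of
    building and storing a full tree. `max_depth` is a hypothetical safety cap.
    """
    if len(sample) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))           # pick a random attribute
    lo = min(row[dim] for row in sample)
    hi = max(row[dim] for row in sample)
    if lo == hi:                              # simplification: stop if this attribute cannot split
        return depth
    split = rng.uniform(lo, hi)               # pick a random split value in the observed range
    # Keep only the rows that fall on the same side of the split as `point`.
    if point[dim] < split:
        side = [row for row in sample if row[dim] < split]
    else:
        side = [row for row in sample if row[dim] >= split]
    return isolation_depth(point, side, rng, depth + 1, max_depth)


def mean_isolation_depth(point, data, n_trees=100, subsample=64, seed=0):
    """Average the isolation depth of `point` over many random subsamples of `data`."""
    rng = random.Random(seed)
    size = min(subsample, len(data))
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, size)       # each run sees only a small subsample
        total += isolation_depth(point, sample, rng)
    return total / n_trees


# Synthetic demo data (made up for this sketch): one dense cluster plus one distant point.
_gen = random.Random(42)
cluster = [(_gen.gauss(0.0, 1.0), _gen.gauss(0.0, 1.0)) for _ in range(200)]
outlier = (8.0, 8.0)
data = cluster + [outlier]

outlier_depth = mean_isolation_depth(outlier, data)
inlier_depth = mean_isolation_depth((0.0, 0.0), data)
print("outlier:", outlier_depth, "inlier:", inlier_depth)
```

In this sketch the distant point is typically separated after far fewer splits than a point in the middle of the cluster, which is the property the abstract relies on. No distance or density is computed at any step, and each run works on a small subsample rather than the full data set.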


Bibliographic Details
Published in: ACM Transactions on Knowledge Discovery from Data
Main Authors: Liu, Fei Tony, Ting, Kai Ming, Zhou, Zhi-Hua
Other Authors: Ministry of Science and Technology of the People's Republic of China, National Natural Science Foundation of Jiangsu Province, National Natural Science Foundation of China
Format: Article in Journal/Newspaper
Language: English
Published: Association for Computing Machinery (ACM) 2012
Subjects:
Online Access: http://dx.doi.org/10.1145/2133360.2133363
https://dl.acm.org/doi/pdf/10.1145/2133360.2133363
id cracm:10.1145/2133360.2133363
record_format openpolar
institution Open Polar
collection ACM Publications (Association for Computing Machinery)
op_collection_id cracm
language English
description Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without employing any distance or density measure, an approach fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time complexity and a small memory requirement and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in the training sample.
author2 Ministry of Science and Technology of the People's Republic of China
National Natural Science Foundation of Jiangsu Province
National Natural Science Foundation of China
format Article in Journal/Newspaper
author Liu, Fei Tony
Ting, Kai Ming
Zhou, Zhi-Hua
spellingShingle Liu, Fei Tony
Ting, Kai Ming
Zhou, Zhi-Hua
Isolation-Based Anomaly Detection
author_facet Liu, Fei Tony
Ting, Kai Ming
Zhou, Zhi-Hua
author_sort Liu, Fei Tony
title Isolation-Based Anomaly Detection
title_short Isolation-Based Anomaly Detection
title_full Isolation-Based Anomaly Detection
title_fullStr Isolation-Based Anomaly Detection
title_full_unstemmed Isolation-Based Anomaly Detection
title_sort isolation-based anomaly detection
publisher Association for Computing Machinery (ACM)
publishDate 2012
url http://dx.doi.org/10.1145/2133360.2133363
https://dl.acm.org/doi/pdf/10.1145/2133360.2133363
genre Orca
genre_facet Orca
op_source ACM Transactions on Knowledge Discovery from Data
volume 6, issue 1, page 1-39
ISSN 1556-4681 1556-472X
op_doi https://doi.org/10.1145/2133360.2133363
container_title ACM Transactions on Knowledge Discovery from Data
container_volume 6
container_issue 1
container_start_page 1
op_container_end_page 39
_version_ 1812817255064403968