Isolation-based anomaly detection

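Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without employing any distance or density measure, making it fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time complexity and a small memory requirement and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in the training sample.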
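The isolation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the commonly cited formulation of the method, which this record does not spell out: random axis-parallel splits on a subsample, a tree height limit of ceil(log2(sample size)), and a score of 2^(-E[h(x)] / c(psi)). All function names and default parameters here are hypothetical, chosen only for illustration.

```python
import math
import random


def c(n):
    """Normalizer: average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximates H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random attribute at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        # Simplification (assumption): stop if the chosen attribute is constant.
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Depth at which the point lands, plus an adjustment for unsplit leaves."""
    if "split" not in node:
        return depth + c(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def anomaly_scores(data, queries, n_trees=100, sample_size=256, seed=0):
    """Score queries in (0, 1]; higher means easier to isolate.

    Assumes data is a list of at least two equal-length numeric tuples.
    """
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    trees = [
        build_tree(rng.sample(data, psi), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    scores = []
    for q in queries:
        mean_len = sum(path_length(q, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_len / c(psi)))
    return scores
```

Under these assumptions, a point far from a dense cluster tends to be separated after only a few random splits, so its average path length is short and its score is closer to 1. Each tree is built on a fixed-size subsample, so the cost of building the forest depends on the number of trees and the subsample size rather than on the full data size.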


Bibliographic Details
Published in: ACM Transactions on Knowledge Discovery from Data
Main Authors: Liu, Fei; Ting, Kaiming; Zhou, Zhi-Hua
Format: Article in Journal/Newspaper
Language: unknown
Published: 2012
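Subjects: Information systems; Information systems applications; Data mining; Computing methodologies; Machine learning
Online Access: http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/34102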
https://doi.org/10.1145/2133360.2133363
id ftfederationuniv:vital:6369
record_format openpolar
institution Open Polar
collection Federation University Australia: Federation ResearchOnline
op_collection_id ftfederationuniv
language unknown
topic Information systems
Information systems applications
Data mining
Computing methodologies
Machine learning
description Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without employing any distance or density measure, making it fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time complexity and a small memory requirement and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in the training sample.
format Article in Journal/Newspaper
author Liu, Fei
Ting, Kaiming
Zhou, Zhi-Hua
title Isolation-based anomaly detection
publishDate 2012
url http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/34102
https://doi.org/10.1145/2133360.2133363
genre Orca
op_relation ACM Transactions on Knowledge Discovery from Data Vol. 6, no. 1 (2012), p. 1-39
http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/34102
vital:6369
https://doi.org/10.1145/2133360.2133363
ISSN:1556-4681
op_rights Copyright Association for Computing Machinery, Inc.
This metadata is freely available under a CC0 license
op_doi https://doi.org/10.1145/2133360.2133363
container_title ACM Transactions on Knowledge Discovery from Data
container_volume 6
container_issue 1
container_start_page 1
op_container_end_page 39