Isolation-based anomaly detection

Bibliographic Details
Published in:	ACM Transactions on Knowledge Discovery from Data
Main Authors:	Liu, Fei; Ting, Kaiming; Zhou, Zhi-Hua
Format:	Article in Journal/Newspaper
Language:	unknown
Published:	2012
Subjects:	Information systems; Information systems applications; Data mining; Computing methodologies; Machine learning; Orca
Online Access:	http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/34102 https://doi.org/10.1145/2133360.2133363

Description
Summary:	Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without employing any distance or density measure, an approach fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time complexity and a small memory requirement and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF, and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in the training sample.
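
Illustration:	The isolation idea in the summary can be sketched in a few lines of Python. This is a minimal sketch and not the authors' implementation: the function names, the dictionary-based tree layout, the default of 100 trees with a subsample of 256 points, and the depth limit are all assumptions made for illustration. The scoring formula and path-length normalizer follow the form commonly given for isolation forests; the record itself does not state them.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # used to approximate harmonic numbers


def average_path_length(n):
    """Normalizer: average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on this attribute; stop here
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def build_forest(data, n_trees=100, sample_size=256, seed=0):
    """Build each tree on a small random subsample (no distance or density used)."""
    rng = random.Random(seed)
    psi = min(sample_size, len(data))  # effective subsample size, assumed >= 2
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, psi


def path_length(tree, x):
    """Number of splits needed to reach x's leaf, plus a correction for leaf size."""
    depth = 0
    node = tree
    while "size" not in node:
        node = node["left"] if x[node["dim"]] < node["split"] else node["right"]
        depth += 1
    return depth + average_path_length(node["size"])


def anomaly_score(forest, x):
    """Score in (0, 1]; shorter average paths give scores closer to 1."""
    trees, psi = forest
    mean_depth = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(psi))
```

Under these assumptions, a point far from the bulk of the data tends to be separated after only a few random splits, so its average path is short and its score is closer to 1. Because each tree sees only a fixed-size subsample, the cost of building the forest does not grow with the square of the data set size, which is the property the summary attributes to subsampling.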

Similar Items