Isolation-Based Anomaly Detection
Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This paper proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without empl...
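
The isolation idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the commonly described formulation (random axis-parallel splits on a subsample, a depth limit of log2 of the subsample size, and a score of 2^(-E[h(x)]/c(n))), and the function names, the 100 trees, the subsample size of 256, and the synthetic data are all choices made here for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # used to approximate harmonic numbers


def average_path_length(n):
    """Normalising term c(n): average path length of an unsuccessful
    binary-search-tree lookup over n points (assumed formulation)."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random attribute at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # attribute is constant here, so no split is possible
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Number of edges x traverses, plus c(size) for an unsplit leaf."""
    if tree[0] == "leaf":
        return depth + average_path_length(tree[1])
    _, dim, split, left, right = tree
    child = left if x[dim] < split else right
    return path_length(child, x, depth + 1)


def build_forest(data, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on its own random subsample."""
    rng = random.Random(seed)
    size = min(sample_size, len(data))
    if size < 2:
        raise ValueError("need at least two data points")
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(data, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, size


def anomaly_score(forest, x):
    """Score in (0, 1]; shorter average paths give scores closer to 1."""
    trees, size = forest
    mean_path = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / average_path_length(size))


# Hypothetical data: a Gaussian cluster plus one distant point.
demo_rng = random.Random(42)
data = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(500)]
data.append((8.0, 8.0))

forest = build_forest(data)
outlier_score = anomaly_score(forest, (8.0, 8.0))
inlier_score = anomaly_score(forest, (0.0, 0.0))
print(outlier_score, inlier_score)
```

Because each tree sees only a fixed-size subsample, the cost of building the forest in this sketch does not grow with the size of the full data set, which is the property the abstract attributes to subsampling.
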
Main Authors: | Fei Tony Liu, Kai Ming Ting, Zhi-hua Zhou |
---|---|
Other Authors: | The Pennsylvania State University CiteSeerX Archives |
Format: | Text |
Language: | English |
Subjects: | |
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.673.5779 http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/tkdd11.pdf |
id |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.673.5779 |
---|---|
record_format |
openpolar |
spelling |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.673.5779 2023-05-15T17:53:48+02:00 Monash University and Fei Tony Liu Kai Ming Ting Zhi-hua Zhou The Pennsylvania State University CiteSeerX Archives application/pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.673.5779 http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/tkdd11.pdf en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.673.5779 http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/tkdd11.pdf Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/tkdd11.pdf Categories and Subject Descriptors H.2.8 [Database Management Database Applications— Data Mining I.2.6 [Artificial Intelligence Learning General Terms Algorithm Design Experimentation Additional Key Words and Phrases Anomaly detection outlier detection ensemble methods binary tree text ftciteseerx 2016-01-08T17:29:22Z Anomalies are data points that are few and different. As a result of these properties, we show that, anomalies are susceptible to a mechanism called isolation. This paper proposes a method called Isolation Forest (iForest) which detects anomalies purely based on the concept of isolation without employing any distance or density measure—fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time-complexity and a small memory-requirement, and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF and Random Forests in terms of AUC, processing time, and it is robust against masking and swamping effects. iForest also works well in high dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in training sample. Text Orca Unknown |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
ftciteseerx |
language |
English |
topic |
Categories and Subject Descriptors: H.2.8 [Database Management]: Database Applications, Data Mining; I.2.6 [Artificial Intelligence]: Learning. General Terms: Algorithm, Design, Experimentation. Additional Key Words and Phrases: anomaly detection, outlier detection, ensemble methods, binary tree |
spellingShingle |
Categories and Subject Descriptors H.2.8 [Database Management Database Applications— Data Mining I.2.6 [Artificial Intelligence Learning General Terms Algorithm Design Experimentation Additional Key Words and Phrases Anomaly detection outlier detection ensemble methods binary tree Fei Tony Liu Kai Ming Ting Zhi-hua Zhou Monash University and |
topic_facet |
Categories and Subject Descriptors: H.2.8 [Database Management]: Database Applications, Data Mining; I.2.6 [Artificial Intelligence]: Learning. General Terms: Algorithm, Design, Experimentation. Additional Key Words and Phrases: anomaly detection, outlier detection, ensemble methods, binary tree |
description |
Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This paper proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without employing any distance or density measure, an approach fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time complexity and a small memory requirement, and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in the training sample. |
author2 |
The Pennsylvania State University CiteSeerX Archives |
format |
Text |
author |
Fei Tony Liu Kai Ming Ting Zhi-hua Zhou |
author_facet |
Fei Tony Liu Kai Ming Ting Zhi-hua Zhou |
author_sort |
Fei Tony Liu |
title |
Monash University and |
title_short |
Monash University and |
title_full |
Monash University and |
title_fullStr |
Monash University and |
title_full_unstemmed |
Monash University and |
title_sort |
monash university and |
url |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.673.5779 http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/tkdd11.pdf |
genre |
Orca |
genre_facet |
Orca |
op_source |
http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/tkdd11.pdf |
op_relation |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.673.5779 http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/tkdd11.pdf |
op_rights |
Metadata may be used without restrictions as long as the oai identifier remains attached to it. |
_version_ |
1766161508594614272 |