Isolation forest

Most existing model-based approaches to anomaly detection construct a profile of normal instances, then identify instances that do not conform to the normal profile as anomalies. This paper proposes a fundamentally different model-based method that explicitly isolates anomalies instead of profiling n...

Bibliographic Details
Main Authors:	Liu, Fei; Ting, Kaiming; Zhou, Zhi-Hua
Format:	Conference Object
Language:	unknown
Published:	Pisa IEEE Computer Society 2008
Subjects:	0801 Artificial Intelligence and Image Processing; Computational complexity; Data mining; Learning (artificial intelligence); Trees (mathematics); Orca
Online Access:	http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294

id	ftfederationuniv:vital:6367
record_format	openpolar
institution	Open Polar
collection	Federation University Australia: Federation ResearchOnline
op_collection_id	ftfederationuniv
language	unknown
topic	0801 Artificial Intelligence and Image Processing; Computational complexity; Data mining; Learning (artificial intelligence); Trees (mathematics)
topic_facet	0801 Artificial Intelligence and Image Processing; Computational complexity; Data mining; Learning (artificial intelligence); Trees (mathematics)
description	Most existing model-based approaches to anomaly detection construct a profile of normal instances, then identify instances that do not conform to the normal profile as anomalies. This paper proposes a fundamentally different model-based method that explicitly isolates anomalies instead of profiling normal points. To the best of our knowledge, the concept of isolation has not been explored in the current literature. The use of isolation enables the proposed method, iForest, to exploit sub-sampling to an extent that is not feasible in existing methods, creating an algorithm that has linear time complexity with a low constant and a low memory requirement. Our empirical evaluation shows that iForest compares favourably with ORCA (a near-linear time complexity distance-based method), LOF and random forests in terms of AUC and processing time, especially on large data sets. iForest also works well in high-dimensional problems that have a large number of irrelevant attributes, and in situations where the training set does not contain any anomalies.
format	Conference Object
author	Liu, Fei; Ting, Kaiming; Zhou, Zhi-Hua
author_facet	Liu, Fei; Ting, Kaiming; Zhou, Zhi-Hua
author_sort	Liu, Fei
title	Isolation forest
title_short	Isolation forest
title_full	Isolation forest
title_fullStr	Isolation forest
title_full_unstemmed	Isolation forest
title_sort	isolation forest
publisher	Pisa IEEE Computer Society
publishDate	2008
url	http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294
genre	Orca
genre_facet	Orca
op_relation	Proceedings of the Eighth IEEE International Conference on Data Mining p. 413-422 http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/69294 vital:6367 ISBN:9780769535029
op_rights	This metadata is freely available under a CC0 license
_version_	1766161483981389824

Similar Items