Unsupervised Class Separation of Multivariate Data through Cumulative Variance-based Ranking
Abstract—This paper introduces a new extension of outlier detection approaches and a new concept, class separation through variance. We show that accumulating information about the outlierness of points in multiple subspaces leads to a ranking in which classes with differing variance naturally tend...
Main Authors: | Andrew Foss, Osmar R. Zaïane, Sandra Zilles |
---|---|
Other Authors: | The Pennsylvania State University CiteSeerX Archives |
Format: | Text |
Language: | English |
Subjects: | |
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.3872 http://www2.cs.uregina.ca/%7Ezilles/fossZZ09.pdf |
id |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.231.3872 |
---|---|
record_format |
openpolar |
spelling |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.231.3872 2023-05-15T17:53:44+02:00 Unsupervised Class Separation of Multivariate Data through Cumulative Variance-based Ranking Andrew Foss Osmar R. Zaïane Sandra Zilles The Pennsylvania State University CiteSeerX Archives application/pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.3872 http://www2.cs.uregina.ca/%7Ezilles/fossZZ09.pdf en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.3872 http://www2.cs.uregina.ca/%7Ezilles/fossZZ09.pdf Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://www2.cs.uregina.ca/%7Ezilles/fossZZ09.pdf text ftciteseerx 2016-01-07T18:45:42Z Abstract: This paper introduces a new extension of outlier detection approaches and a new concept, class separation through variance. We show that accumulating information about the outlierness of points in multiple subspaces leads to a ranking in which classes with differing variance naturally tend to separate. Exploiting this leads to a highly effective and efficient unsupervised class separation approach, especially useful in the difficult case of heavily overlapping distributions. Unlike typical outlier detection algorithms, this method can be applied beyond the ‘rare classes’ case with great success. Two novel algorithms that implement this approach are provided. Additionally, experiments show that the novel methods typically outperform other state-of-the-art outlier detection methods, such as Feature Bagging, SOE1, LOF, ORCA and Robust Mahalanobis Distance, on high-dimensional data and even compete with the leading supervised classification methods. Keywords: Outlier Detection; Classification; Subspaces. Text Orca Unknown |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
ftciteseerx |
language |
English |
description |
Abstract: This paper introduces a new extension of outlier detection approaches and a new concept, class separation through variance. We show that accumulating information about the outlierness of points in multiple subspaces leads to a ranking in which classes with differing variance naturally tend to separate. Exploiting this leads to a highly effective and efficient unsupervised class separation approach, especially useful in the difficult case of heavily overlapping distributions. Unlike typical outlier detection algorithms, this method can be applied beyond the ‘rare classes’ case with great success. Two novel algorithms that implement this approach are provided. Additionally, experiments show that the novel methods typically outperform other state-of-the-art outlier detection methods, such as Feature Bagging, SOE1, LOF, ORCA and Robust Mahalanobis Distance, on high-dimensional data and even compete with the leading supervised classification methods. Keywords: Outlier Detection; Classification; Subspaces. |
author2 |
The Pennsylvania State University CiteSeerX Archives |
format |
Text |
author |
Andrew Foss Osmar R. Zaïane Sandra Zilles |
spellingShingle |
Andrew Foss Osmar R. Zaïane Sandra Zilles Unsupervised Class Separation of Multivariate Data through Cumulative Variance-based Ranking |
author_facet |
Andrew Foss Osmar R. Zaïane Sandra Zilles |
author_sort |
Andrew Foss |
title |
Unsupervised Class Separation of Multivariate Data through Cumulative Variance-based Ranking |
title_sort |
unsupervised class separation of multivariate data through cumulative variance-based ranking |
url |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.3872 http://www2.cs.uregina.ca/%7Ezilles/fossZZ09.pdf |
genre |
Orca |
genre_facet |
Orca |
op_source |
http://www2.cs.uregina.ca/%7Ezilles/fossZZ09.pdf |
op_relation |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.3872 http://www2.cs.uregina.ca/%7Ezilles/fossZZ09.pdf |
op_rights |
Metadata may be used without restrictions as long as the oai identifier remains attached to it. |
_version_ |
1766161434445611008 |
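
The central idea in the abstract (summing outlierness over many subspaces so that classes with different variance drift apart in the final ranking) can be illustrated with a small sketch. This is not either of the paper's two algorithms, which this record does not describe. The scoring rule (squared robust z-score distance from the median), the random choice of subspaces, the function names `cumulative_outlier_scores` and `demo`, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def cumulative_outlier_scores(X, n_subspaces=50, subspace_dim=3, seed=0):
    """Accumulate a simple outlierness score over random feature subsets.

    Assumption: outlierness in a subspace is the squared distance from the
    robust centre (median), after scaling each feature by its MAD. This is a
    stand-in for whatever per-subspace detector one prefers.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    mad[mad == 0] = 1.0  # guard against constant features
    Z = (X - med) / mad
    total = np.zeros(n)
    for _ in range(n_subspaces):
        # Pick a random subspace (a subset of the features).
        feats = rng.choice(d, size=min(subspace_dim, d), replace=False)
        # Add this subspace's outlierness to the running total.
        total += np.sum(Z[:, feats] ** 2, axis=1)
    return total


def demo(seed=0):
    """Two heavily overlapping classes: same mean, different variance."""
    rng = np.random.default_rng(seed)
    d = 20
    low = rng.normal(0.0, 1.0, size=(300, d))   # low-variance class (label 0)
    high = rng.normal(0.0, 2.0, size=(300, d))  # high-variance class (label 1)
    X = np.vstack([low, high])
    labels = np.r_[np.zeros(300, dtype=int), np.ones(300, dtype=int)]
    scores = cumulative_outlier_scores(X, seed=seed)
    order = np.argsort(-scores)  # most outlying points first
    # Fraction of the top half of the ranking that is the high-variance class.
    return labels[order[:300]].mean()


if __name__ == "__main__":
    print(f"high-variance share of top half: {demo():.3f}")
```

Under these assumptions, `demo()` returns the share of high-variance points in the top half of the ranking; a value well above the 0.5 expected from a random ordering indicates that the two classes separate in the ranking even though their means coincide. No class labels are used to compute the scores; they serve only to evaluate the result.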