Unsupervised Class Separation of Multivariate Data through Cumulative Variance-based Ranking

Abstract—This paper introduces a new extension of outlier detection approaches and a new concept, class separation through variance. We show that accumulating information about the outlierness of points in multiple subspaces leads to a ranking in which classes with differing variance naturally tend...
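The idea stated in the abstract, accumulating outlier scores over many subspaces so that classes of differing variance drift apart in the resulting ranking, can be illustrated with a small Python sketch. This is not the authors' algorithm. The function names, the random choice of subspaces, and the median/MAD-based score are all assumptions made for illustration, and the data are synthetic.

```python
import numpy as np


def cumulative_outlier_ranking(X, n_subspaces=50, subspace_dim=3, seed=0):
    """Rank points by outlier scores accumulated over random feature subspaces.

    The per-subspace score (squared deviation from the median, scaled by the
    median absolute deviation) is a stand-in chosen for this sketch; it is
    not the scoring function used in the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0) + 1e-12  # avoid division by zero
    z2 = ((X - med) / mad) ** 2  # robust squared deviation per feature
    total = np.zeros(n)
    for _ in range(n_subspaces):
        cols = rng.choice(d, size=subspace_dim, replace=False)
        total += z2[:, cols].sum(axis=1)  # outlierness within this subspace
    order = np.argsort(-total)  # most outlying points first
    return order, total


def demo(seed=0):
    """Two synthetic classes with equal means but different variances."""
    rng = np.random.default_rng(seed)
    n_per_class, d = 200, 20
    low = rng.normal(0.0, 1.0, size=(n_per_class, d))   # low-variance class
    high = rng.normal(0.0, 3.0, size=(n_per_class, d))  # high-variance class
    X = np.vstack([low, high])
    labels = np.array([0] * n_per_class + [1] * n_per_class)
    order, _ = cumulative_outlier_ranking(X, seed=seed)
    # Fraction of the top half of the ranking taken by the high-variance class.
    return labels[order[:n_per_class]].mean()


if __name__ == "__main__":
    print(f"high-variance share of top half: {demo():.2f}")
```

Both synthetic classes are centred at the origin, so they overlap heavily and differ only in spread. `demo()` returns the share of the top half of the ranking occupied by the higher-variance class, and a value near 1 means the two classes have separated in the cumulative ranking.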

Bibliographic Details
Main Authors: Andrew Foss, Osmar R. Zaïane, Sandra Zilles
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.3872
http://www2.cs.uregina.ca/%7Ezilles/fossZZ09.pdf
id ftciteseerx:oai:CiteSeerX.psu:10.1.1.231.3872
record_format openpolar
institution Open Polar
collection Unknown
op_collection_id ftciteseerx
description Abstract: This paper introduces a new extension of outlier detection approaches and a new concept, class separation through variance. We show that accumulating information about the outlierness of points in multiple subspaces leads to a ranking in which classes with differing variance naturally tend to separate. Exploiting this leads to a highly effective and efficient unsupervised class separation approach, especially useful in the difficult case of heavily overlapping distributions. Unlike typical outlier detection algorithms, this method can be applied beyond the ‘rare classes’ case with great success. Two novel algorithms that implement this approach are provided. Additionally, experiments show that the novel methods typically outperform other state-of-the-art outlier detection methods, such as Feature Bagging, SOE1, LOF, ORCA and Robust Mahalanobis Distance, on high-dimensional data, and even compete with the leading supervised classification methods. Keywords: Outlier Detection; Classification; Subspaces.
op_rights Metadata may be used without restrictions as long as the oai identifier remains attached to it.