Validation of Nearest Neighbor Classifiers

We develop a probabilistic bound on the error rate of the nearest neighbor classifier formed from a set of labelled examples. The bound is computed using only the examples in the set. A subset of the examples is used as a validation set to bound the error rate of the classifier formed from the remai...
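The validation step in the abstract (hold out part of the labelled set, build the nearest neighbor classifier from the rest, and bound its error rate) can be illustrated with a minimal sketch. It uses a standard one-sided Hoeffding bound, which is an assumption made here for illustration and not necessarily the bound derived in the paper. The names `nearest_label` and `holdout_error_bound` and the toy data are hypothetical. The sketch covers only the reduced classifier; the paper's second step, bounding the difference between the reduced and the original classifier, is not reproduced.

```python
import math
import random


def nearest_label(train, x):
    """Return the label of the training example closest to x (squared Euclidean distance)."""
    closest = min(train, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))
    return closest[1]


def holdout_error_bound(examples, n_val, delta, seed=0):
    """Hold out n_val examples and bound the error of the 1-NN classifier built from the rest.

    Returns (validation error, upper bound). Assuming i.i.d. examples, the
    Hoeffding inequality gives: with probability at least 1 - delta,
    true error <= validation error + sqrt(ln(1/delta) / (2 * n_val)).
    This is a generic bound chosen for illustration, not the paper's own.
    """
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    val, train = shuffled[:n_val], shuffled[n_val:]
    mistakes = sum(nearest_label(train, x) != y for x, y in val)
    val_error = mistakes / n_val
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n_val))
    return val_error, min(1.0, val_error + slack)


# Hypothetical toy problem: the label is 1 when the first coordinate exceeds 0.5.
toy_rng = random.Random(1)
data = []
for _ in range(200):
    point = (toy_rng.random(), toy_rng.random())
    data.append((point, 1 if point[0] > 0.5 else 0))

val_error, bound = holdout_error_bound(data, n_val=50, delta=0.05)
print(f"validation error: {val_error:.3f}, 95% upper bound: {bound:.3f}")
```

The slack term depends only on the validation set size and the confidence level, so holding out more examples tightens the bound but leaves fewer examples for the classifier itself. That trade-off is why a bound on the difference between the reduced and the full classifier is needed.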


Bibliographic Details
Main Author: Eric Bax
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Published: 1998
Subjects:
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.1800
http://www.math.urich.edu/~ebax/papers/nearest_neighbor.ps
id ftciteseerx:oai:CiteSeerX.psu:10.1.1.44.1800
record_format openpolar
spelling ftciteseerx:oai:CiteSeerX.psu:10.1.1.44.1800 2023-05-15T17:32:54+02:00 Validation of Nearest Neighbor Classifiers Eric Bax The Pennsylvania State University CiteSeerX Archives 1998 application/postscript http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.1800 http://www.math.urich.edu/~ebax/papers/nearest_neighbor.ps en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.1800 http://www.math.urich.edu/~ebax/papers/nearest_neighbor.ps Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://www.math.urich.edu/~ebax/papers/nearest_neighbor.ps text 1998 ftciteseerx 2016-01-08T05:11:29Z We develop a probabilistic bound on the error rate of the nearest neighbor classifier formed from a set of labelled examples. The bound is computed using only the examples in the set. A subset of the examples is used as a validation set to bound the error rate of the classifier formed from the remaining examples. Then a bound is computed for the difference in error rates between the original classifier and the reduced classifier. This bound is computed by partitioning the validation set and using each subset to compute bounds for the error rate difference due to the other subsets. 1 Framework Consider the following machine learning framework. There is an unknown boolean-valued target function, and there is a distribution over the input space of the function. For example, the input distribution could consist of typical satellite images of the North Atlantic Ocean, and the target function could be 1 if the image contains a large iceberg and 0 otherwise. We have a set of in-sample data e. Text North Atlantic Unknown
institution Open Polar
collection Unknown
op_collection_id ftciteseerx
language English
description We develop a probabilistic bound on the error rate of the nearest neighbor classifier formed from a set of labelled examples. The bound is computed using only the examples in the set. A subset of the examples is used as a validation set to bound the error rate of the classifier formed from the remaining examples. Then a bound is computed for the difference in error rates between the original classifier and the reduced classifier. This bound is computed by partitioning the validation set and using each subset to compute bounds for the error rate difference due to the other subsets.
1 Framework
Consider the following machine learning framework. There is an unknown boolean-valued target function, and there is a distribution over the input space of the function. For example, the input distribution could consist of typical satellite images of the North Atlantic Ocean, and the target function could be 1 if the image contains a large iceberg and 0 otherwise. We have a set of in-sample data e.
author2 The Pennsylvania State University CiteSeerX Archives
format Text
author Eric Bax
spellingShingle Eric Bax
Validation of Nearest Neighbor Classifiers
author_facet Eric Bax
author_sort Eric Bax
title Validation of Nearest Neighbor Classifiers
title_short Validation of Nearest Neighbor Classifiers
title_full Validation of Nearest Neighbor Classifiers
title_fullStr Validation of Nearest Neighbor Classifiers
title_full_unstemmed Validation of Nearest Neighbor Classifiers
title_sort validation of nearest neighbor classifiers
publishDate 1998
url http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.1800
http://www.math.urich.edu/~ebax/papers/nearest_neighbor.ps
genre North Atlantic
genre_facet North Atlantic
op_source http://www.math.urich.edu/~ebax/papers/nearest_neighbor.ps
op_relation http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.1800
http://www.math.urich.edu/~ebax/papers/nearest_neighbor.ps
op_rights Metadata may be used without restrictions as long as the oai identifier remains attached to it.
_version_ 1766131223852220416