Similarity of samples and trimming
Published in: | Bernoulli |
---|---|
Main Authors: | , , , |
Other Authors: | |
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | International Statistical Institute; Chapman and Hall, 2012 |
Subjects: | |
Online Access: | https://hdl.handle.net/10902/29685 https://doi.org/10.3150/11-BEJ351 |
Summary: | We say that two probabilities are similar at level α if they are contaminated versions (up to an α fraction) of the same common probability. We show how this model is related to minimal distances between sets of trimmed probabilities. Empirical versions turn out to present an overfitting effect in the sense that trimming beyond the similarity level results in trimmed samples that are closer than expected to each other. We show how this can be combined with a bootstrap approach to assess similarity from two data samples. Research partially supported by the Spanish Ministerio de Ciencia e Innovación, Grant MTM2008-06067-C02-01, and 02 and by the Consejería de Educación y Cultura de la Junta de Castilla y León, GR150. The authors would like to thank two anonymous referees for their careful reading of the manuscript, their suggestions and the pointers to relevant references that helped us to greatly improve our original version. |
---|
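
The similarity model described in the summary can be written out as a short sketch. The notation here is an assumption for illustration and is not taken from this record: P and Q are the two probabilities, P_0 is the common part, P' and Q' are the contaminations, R_α(P) is the set of α-trimmings of P, and d_TV is the total variation distance, taken as the supremum over sets so that it lies in [0, 1].

```latex
% Similarity at level \alpha \in (0,1): P and Q share a common component P_0,
% each contaminated by at most an \alpha fraction (P', Q' are arbitrary).
\[
  P = (1-\alpha)\,P_0 + \alpha\,P', \qquad
  Q = (1-\alpha)\,P_0 + \alpha\,Q'.
\]

% Assumed definition of the set of \alpha-trimmings of P: probabilities obtained
% by removing at most an \alpha fraction of the mass of P and renormalizing.
\[
  \mathcal{R}_\alpha(P) = \Bigl\{ R \ll P \;:\; \frac{dR}{dP} \le \frac{1}{1-\alpha}
  \ \ P\text{-a.s.} \Bigr\}.
\]

% Under these definitions the model says the two trimming sets share an element
% (namely P_0), so the minimal distance between the sets is zero:
\[
  \mathcal{R}_\alpha(P) \cap \mathcal{R}_\alpha(Q) \neq \emptyset .
\]

% The model also bounds the total variation distance, since the common part cancels:
\[
  d_{TV}(P,Q) = \alpha\, d_{TV}(P',Q') \le \alpha .
\]
```

Read this way, the "minimal distances between sets of trimmed probabilities" in the summary correspond to the distance between R_α(P) and R_α(Q), which vanishes exactly when a common trimmed version exists.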