A comparative analysis of feature selection methods for biomarker discovery in study of toxicant-treated atlantic cod (Gadus morhua) liver

Bibliographic Details
Main Authors: Zhang, Xiaokang; Jonassen, Inge
Format: Article in Journal/Newspaper
Language: unknown
Published: F1000Research 2017
Online Access: https://dx.doi.org/10.7490/f1000research.1114608.1
https://f1000research.com/posters/6-1359
Description
Summary: Biomarker discovery is of great importance in gene expression analysis in the context of toxicant exposure. Among gene selection methods, differential expression analysis is often applied because of its simplicity and interpretability, but it treats genes individually and disregards the correlations between them. Multivariate feature selection methods have therefore been proposed for biomarker discovery. We compared three methods that stem from different theories: Significance Analysis of Microarrays (SAM), which identifies differentially expressed genes; minimum Redundancy Maximum Relevance (mRMR), which is based on information theory; and Characteristic Direction (GeoDE), which takes a geometrical approach. The methods were compared on two criteria, stability and classification accuracy. Stability was measured by the overlap of the features selected in different sampling steps. Using the feature subsets selected by each of the three methods, we trained four classifiers (Random Forest, Support Vector Machine, ridge regression, and LASSO) and then tested their prediction accuracy to see how much the selected subsets improve it. Tested on gene expression data from two toxicant exposure experiments on Atlantic cod liver, we found that GeoDE is more stable and gives higher prediction accuracy in the low-dose condition.
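
The stability criterion described in the summary (overlap of the features selected in different sampling steps) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the summary does not say which overlap measure was used, so the mean pairwise Jaccard index is taken here as one common choice, and `top_k_by_mean_difference` is a hypothetical stand-in selector, not an implementation of SAM, mRMR, or GeoDE.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Overlap of two feature sets: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def top_k_by_mean_difference(rows, labels, k):
    """Hypothetical stand-in selector (NOT SAM, mRMR, or GeoDE):
    rank features by the absolute difference between class means."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        g0 = [r[j] for r, y in zip(rows, labels) if y == 0]
        g1 = [r[j] for r, y in zip(rows, labels) if y == 1]
        scores.append(abs(sum(g1) / len(g1) - sum(g0) / len(g0)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def selection_stability(rows, labels, selector, k,
                        n_rounds=20, fraction=0.8, seed=0):
    """Mean pairwise Jaccard overlap of the feature sets selected on
    repeated stratified subsamples (assumed measure; needs n_rounds >= 2)."""
    rng = random.Random(seed)
    idx0 = [i for i, y in enumerate(labels) if y == 0]
    idx1 = [i for i, y in enumerate(labels) if y == 1]
    subsets = []
    for _ in range(n_rounds):
        # Sample each class separately so both are always represented.
        chosen = (rng.sample(idx0, max(1, int(fraction * len(idx0))))
                  + rng.sample(idx1, max(1, int(fraction * len(idx1)))))
        sub_rows = [rows[i] for i in chosen]
        sub_labels = [labels[i] for i in chosen]
        subsets.append(selector(sub_rows, sub_labels, k))
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A score of 1 means every subsample yielded the same feature set, and a score near 0 means the selections barely overlap. Any selector with the same `(rows, labels, k)` signature can be passed in, which is how several selection methods could be compared on equal terms.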