Bias-corrected Bayesian Classification with Selected Features

Longhai Li, Department of Mathematics and Statistics, University of Saskatchewan

Copyright

Permission is granted for anyone to copy, use, modify, or distribute these programs and accompanying documents for any purpose, provided this copyright notice is retained and prominently displayed, and note is made of any changes made to these programs. These programs and documents are distributed without any warranty, express or implied. As the programs were written for research purposes only, they have not been tested to the degree that would be advisable in any important application. All use of these programs is entirely at the user's own risk.

Description

This software predicts discrete class labels from a selected subset of high-dimensional features, such as gene expression levels. The data are modeled with a hierarchical Bayesian model that uses heavy-tailed t distributions as priors. When a large number of features are available, one may wish to use only a subset of them, typically those most strongly correlated with the response in the training cases. Such a feature selection procedure is, however, invalid, since the relationship between the response and the features has been exaggerated by the feature selection. This package provides a way to avoid this bias and to yield better-calibrated predictions for future cases when the F-statistic is used to select features.
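The selection bias described above can be seen in a small simulation. The following is only an illustrative sketch, not the package's method and not its R interface: it is written in Python using the standard library only, and the names f_statistic and simulate, the sample sizes, and the seed are all assumptions made for this example. Every feature is pure noise, so no feature is truly related to the class labels, yet the features with the largest F-statistics in the training data appear strongly related to them.

```python
import random
from statistics import mean


def f_statistic(values, labels):
    """One-way ANOVA F-statistic of a single feature across the classes in `labels`."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    between = 0.0  # between-class sum of squares
    within = 0.0   # within-class sum of squares
    for g in groups.values():
        g_mean = mean(g)
        between += len(g) * (g_mean - grand_mean) ** 2
        within += sum((v - g_mean) ** 2 for v in g)
    return (between / (k - 1)) / (within / (n - k))


def simulate(n_cases=40, n_features=500, n_selected=10, seed=1):
    """Select the top features by F-statistic on noise data (hypothetical settings).

    Returns the selected feature indices, their mean F-statistic on the
    training data, and their mean F-statistic on independent fresh data.
    """
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n_cases)]  # two balanced classes

    def noise():
        # One list of n_cases standard normal values per feature.
        return [[rng.gauss(0, 1) for _ in range(n_cases)]
                for _ in range(n_features)]

    train, fresh = noise(), noise()
    f_train = [f_statistic(col, labels) for col in train]
    selected = sorted(range(n_features),
                      key=lambda j: f_train[j], reverse=True)[:n_selected]
    train_mean = mean(f_train[j] for j in selected)
    fresh_mean = mean(f_statistic(fresh[j], labels) for j in selected)
    return selected, train_mean, fresh_mean


if __name__ == "__main__":
    selected, train_mean, fresh_mean = simulate()
    print("mean F of selected features, training data:", round(train_mean, 2))
    print("mean F of the same features, fresh data:   ", round(fresh_mean, 2))
```

In this sketch the selected features look far more discriminating on the training data than on fresh data, which is the exaggeration that a model fitted naively to the selected features would inherit.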

Source Packages and Documentation

References

The methods used in this software are discussed in detail in the following papers:

Instructions for Installing an R Package and Using R

Click here for instructions on installing an R package and using R.