This function trains linear logistic regression models with HMC in restricted Gibbs sampling.
Arguments
- X
Input matrix, of dimension nobs by nvars; each row is an observation vector.
- y
Vector of response variables. Must be coded as non-negative integers, e.g., 1,2,...,C for C classes, label 0 is also allowed.
- fsel
Subsets of features selected before fitting, such as by univariate screening.
- stdx
Logical; if
TRUE
, the original feature values are standardized to havemean = 0
andsd = 1
.- prior
The prior to be applied to the model. Either a list of hyperparameter settings returned by
htlr_prior
or a character string from "t" (student-t), "ghs" (horseshoe), and "neg" (normal-exponential-gamma).- df
The degree freedom of t/ghs/neg prior for coefficients. Will be ignored if the configuration list from
htlr_prior
is passed toprior
.- iter
A positive integer specifying the number of iterations (including warmup).
- warmup
A positive integer specifying the number of warmup (aka burnin). The number of warmup iterations should not be larger than iter and the default is
iter / 2
.- thin
A positive integer specifying the period for saving samples.
- init
The initial state of Markov Chain; it accepts three forms:
a previously fitted
fithtlr
object,a user supplied initial coeficient matrix of (p+1)*K, where p is the number of features, K is the number of classes in y minus 1,
a character string matches the following:
"lasso" - (Default) Use Lasso initial state with
lambda
chosen by cross-validation. Users may specify their own candidatelambda
values via optional argumentlasso.lambda
. Further customized Lasso initial states can be generated bylasso_deltas
."bcbc" - Use initial state generated by package
BCBCSF
(Bias-corrected Bayesian classification). Further customized BCBCSF initial states can be generated bybcbcsf_deltas
. WARNING: This type of initial states can be used for continuous features such as gene expression profiles, but it should not be used for categorical features such as SNP profiles."random" - Use random initial values sampled from N(0, 1).
- leap
The length of leapfrog trajectory in sampling phase.
- leap.warm
The length of leapfrog trajectory in burnin phase.
- leap.stepsize
The integrator step size used in the Hamiltonian simulation.
- cut
The coefficients smaller than this criteria will be fixed in each HMC updating step.
- verbose
Logical; setting it to
TRUE
for tracking MCMC sampling iterations.- rep.legacy
Logical; if
TRUE
, the output produced inHTLR
versions up to legacy-3.1-1 is reproduced. The speed will be typically slower than non-legacy mode on multi-core machine. Default isFALSE
.- keep.warmup.hist
Warmup iterations are not recorded by default, set
TRUE
to enable it.- ...
Other optional parameters:
rda.alpha - A user supplied alpha value for
bcbcsf_deltas
. Default: 0.2.lasso.lambda - A user supplied lambda sequence for
lasso_deltas
. Default: {.01, .02, ..., .05}. Will be ignored ifrep.legacy
is set toTRUE
.
References
Longhai Li and Weixin Yao (2018). Fully Bayesian Logistic Regression with Hyper-Lasso Priors for High-dimensional Feature Selection. Journal of Statistical Computation and Simulation 2018, 88:14, 2827-2851.
Examples
set.seed(12345)
data("colon")
## fit HTLR models with selected features, note that the chain length setting is for demo only
## using t prior with 1 df and log-scale fixed to -10
fit.t <- htlr(X = colon$X, y = colon$y, fsel = 1:100,
prior = htlr_prior("t", df = 1, logw = -10),
init = "bcbc", iter = 20, thin = 1)
## using NEG prior with 1 df and log-scale fixed to -10
fit.neg <- htlr(X = colon$X, y = colon$y, fsel = 1:100,
prior = htlr_prior("neg", df = 1, logw = -10),
init = "bcbc", iter = 20, thin = 1)
#> Warning: random logw currently only supports t prior
## using horseshoe prior with 1 df and auto-selected log-scale
fit.ghs <- htlr(X = colon$X, y = colon$y, fsel = 1:100,
prior = "ghs", df = 1, init = "bcbc",
iter = 20, thin = 1)
#> Warning: random logw currently only supports t prior