This function trains linear logistic regression models with HMC in restricted Gibbs sampling.
It also makes predictions for test cases if X_ts is provided.
Usage
htlr_fit(
  X_tr,
  y_tr,
  fsel = 1:ncol(X_tr),
  stdzx = TRUE,
  ptype = c("t", "ghs", "neg"),
  sigmab0 = 2000,
  alpha = 1,
  s = -10,
  eta = 0,
  iters_h = 1000,
  iters_rmc = 1000,
  thin = 1,
  leap_L = 50,
  leap_L_h = 5,
  leap_step = 0.3,
  hmc_sgmcut = 0.05,
  initial_state = "lasso",
  keep.warmup.hist = FALSE,
  silence = TRUE,
  rep.legacy = TRUE,
  alpha.rda = 0.2,
  lasso.lambda = seq(0.05, 0.01, by = -0.01),
  X_ts = NULL,
  predburn = NULL,
  predthin = 1
)
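As a rough illustration of the call shape, the sketch below fits a model to simulated two-class data. The data, the very short chain lengths, and the choice of `initial_state = "random"` are assumptions made only to keep the example small; they are not recommended settings. The call is guarded so the data setup still runs when the HTLR package is not installed.

```r
set.seed(1)

# Simulated two-class data (hypothetical; any numeric matrix with
# non-negative integer labels could be used instead)
n <- 40
p <- 20
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- rep(1:2, each = n / 2)
X[y == 2, 1:3] <- X[y == 2, 1:3] + 1.5   # make the first three features informative

# Only arguments listed in the usage above; chains are far too short
# for real inference and serve as a demonstration only
args <- list(
  X_tr = X, y_tr = y,
  ptype = "t", alpha = 1, s = -10,
  iters_h = 10, iters_rmc = 10, thin = 1,
  initial_state = "random"
)

if (requireNamespace("HTLR", quietly = TRUE)) {
  fit <- do.call(HTLR::htlr_fit, args)
}
```

Collecting the arguments in a list first makes it easy to inspect or vary the settings before the sampler is run.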
Arguments
- X_tr
Input matrix, of dimension nobs by nvars; each row is an observation vector.
- y_tr
Vector of response variables. Must be coded as non-negative integers, e.g., 1, 2, ..., C for C classes; label 0 is also allowed.
- fsel
Subsets of features selected before fitting, such as by univariate screening.
- stdzx
Logical; if TRUE, the original feature values are standardized to have mean = 0 and sd = 1.
- ptype
The prior to be applied to the model. Either "t" (student-t, default), "ghs" (horseshoe), or "neg" (normal-exponential-gamma).
- sigmab0
The sd of the normal prior for the intercept.
- alpha
The degrees of freedom of the t/ghs/neg prior for coefficients.
- s
The log scale of priors (logw) for coefficients.
- eta
The sd of the normal prior for logw. When it is set to 0, logw is fixed. Otherwise, logw is assigned a normal prior and is updated during sampling.
- iters_h
A positive integer specifying the number of warmup (aka burn-in) iterations.
- iters_rmc
A positive integer specifying the number of iterations after warmup.
- thin
A positive integer specifying the period for saving samples.
- leap_L
The length of the leapfrog trajectory in the sampling phase.
- leap_L_h
The length of the leapfrog trajectory in the burn-in phase.
- leap_step
The stepsize adjustment multiplied to the second-order partial derivatives of log posterior.
- hmc_sgmcut
Coefficients smaller than this threshold are held fixed in each HMC update step.
- initial_state
The initial state of the Markov chain; can be a previously fitted fithtlr object, a user-supplied initial state vector, or a character string matching one of the following:
"lasso" - (Default) Use Lasso initial state with lambda chosen by cross-validation. Users may specify their own candidate lambda values via the optional argument lasso.lambda. Further customized Lasso initial states can be generated by lasso_deltas.
"bcbcsfrda" - Use initial state generated by package BCBCSF (Bias-corrected Bayesian classification). Further customized BCBCSF initial states can be generated by bcbcsf_deltas. WARNING: This type of initial state can be used for continuous features such as gene expression profiles, but it should not be used for categorical features such as SNP profiles.
"random" - Use random initial values sampled from N(0, 1).
- keep.warmup.hist
Warmup iterations are not recorded by default; set to TRUE to record them.
- silence
Set to FALSE to track MCMC sampling iterations.
- rep.legacy
Logical; if TRUE, the output produced in HTLR versions up to legacy-3.1-1 is reproduced. This is typically slower than non-legacy mode on multi-core machines.
- alpha.rda
A user-supplied alpha value for bcbcsf_deltas when setting up the BCBCSF initial state. Default: 0.2.
- lasso.lambda
A user-supplied lambda sequence for lasso_deltas when setting up the Lasso initial state. Default: {.01, .02, ..., .05}. Ignored if rep.legacy is set to TRUE.
- X_ts
Test data for which predictions are to be made.
- predburn, predthin
For prediction based on X_ts (when supplied), the first predburn Markov chain (super)iterations are discarded, and only every predthin-th iteration is used for inference.
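The fsel, X_ts, predburn and predthin arguments described above can be combined as in the following sketch. The screening rule (absolute Welch t-statistic, top 10 features), the simulated data, and the specific predburn and predthin values are illustrative assumptions rather than anything prescribed by the package. The sketch also assumes that X_ts has the same columns as the full X_tr, so that fsel indexes both in the same way.

```r
set.seed(2)

# Simulated two-class data (hypothetical), split into training and test rows
n <- 60
p <- 50
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- rep(1:2, times = n / 2)
X[y == 2, 1:3] <- X[y == 2, 1:3] + 2     # features 1-3 carry the signal
tr   <- 1:40
X_tr <- X[tr, ]
y_tr <- y[tr]
X_ts <- X[-tr, ]

# Univariate screening: absolute Welch t-statistic per feature (two classes only)
welch_t <- function(x, g) {
  a <- x[g == 1]
  b <- x[g == 2]
  (mean(a) - mean(b)) / sqrt(var(a) / length(a) + var(b) / length(b))
}
score <- apply(X_tr, 2, function(x) abs(welch_t(x, y_tr)))
fsel  <- sort(order(score, decreasing = TRUE)[1:10])   # indices of the 10 top-ranked columns

# Guarded call; chain lengths are for demonstration only
if (requireNamespace("HTLR", quietly = TRUE)) {
  fit <- HTLR::htlr_fit(
    X_tr = X_tr, y_tr = y_tr, fsel = fsel,
    iters_h = 20, iters_rmc = 20, thin = 1,
    initial_state = "random",
    X_ts = X_ts, predburn = 5, predthin = 1
  )
}
```

Screening on the training rows only, as done here, avoids using the test cases when choosing fsel.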