This function trains linear logistic regression models with HMC in restricted Gibbs sampling.

htlr(
  X,
  y,
  fsel = 1:ncol(X),
  stdx = TRUE,
  prior = "t",
  df = 1,
  iter = 2000,
  warmup = floor(iter/2),
  thin = 1,
  init = "lasso",
  leap = 50,
  leap.warm = floor(leap/10),
  leap.stepsize = 0.3,
  cut = 0.05,
  verbose = FALSE,
  rep.legacy = FALSE,
  keep.warmup.hist = FALSE,
  ...
)

Arguments

X

Input matrix, of dimension nobs by nvars; each row is an observation vector.

y

Vector of response variables. Must be coded as non-negative integers, e.g., 1,2,...,C for C classes, label 0 is also allowed.

fsel

Subsets of features selected before fitting, such as by univariate screening.

stdx

Logical; if TRUE, the original feature values are standardized to have mean = 0 and sd = 1.

prior

The prior to be applied to the model. Either a list of hyperparameter settings returned by htlr_prior or a character string from "t" (student-t), "ghs" (horseshoe), and "neg" (normal-exponential-gamma).

df

The degree freedom of t/ghs/neg prior for coefficients. Will be ignored if the configuration list from htlr_prior is passed to prior.

iter

A positive integer specifying the number of iterations (including warmup).

warmup

A positive integer specifying the number of warmup (aka burnin). The number of warmup iterations should not be larger than iter and the default is iter / 2.

thin

A positive integer specifying the period for saving samples.

init

The initial state of Markov Chain; it accepts three forms:

  • a previously fitted fithtlr object,

  • a user supplied initial coeficient matrix of (p+1)*K, where p is the number of features, K is the number of classes in y minus 1,

  • a character string matches the following:

    • "lasso" - (Default) Use Lasso initial state with lambda chosen by cross-validation. Users may specify their own candidate lambda values via optional argument lasso.lambda. Further customized Lasso initial states can be generated by lasso_deltas.

    • "bcbc" - Use initial state generated by package BCBCSF (Bias-corrected Bayesian classification). Further customized BCBCSF initial states can be generated by bcbcsf_deltas. WARNING: This type of initial states can be used for continuous features such as gene expression profiles, but it should not be used for categorical features such as SNP profiles.

    • "random" - Use random initial values sampled from N(0, 1).

leap

The length of leapfrog trajectory in sampling phase.

leap.warm

The length of leapfrog trajectory in burnin phase.

leap.stepsize

The integrator step size used in the Hamiltonian simulation.

cut

The coefficients smaller than this criteria will be fixed in each HMC updating step.

verbose

Logical; setting it to TRUE for tracking MCMC sampling iterations.

rep.legacy

Logical; if TRUE, the output produced in HTLR versions up to legacy-3.1-1 is reproduced. The speed will be typically slower than non-legacy mode on multi-core machine. Default is FALSE.

keep.warmup.hist

Warmup iterations are not recorded by default, set TRUE to enable it.

...

Other optional parameters:

  • rda.alpha - A user supplied alpha value for bcbcsf_deltas. Default: 0.2.

  • lasso.lambda - A user supplied lambda sequence for lasso_deltas. Default: {.01, .02, ..., .05}. Will be ignored if rep.legacy is set to TRUE.

Value

An object with S3 class htlr.fit.

References

Longhai Li and Weixin Yao (2018). Fully Bayesian Logistic Regression with Hyper-Lasso Priors for High-dimensional Feature Selection. Journal of Statistical Computation and Simulation 2018, 88:14, 2827-2851.

Examples

set.seed(12345) data("colon") ## fit HTLR models with selected features, note that the chain length setting is for demo only ## using t prior with 1 df and log-scale fixed to -10 fit.t <- htlr(X = colon$X, y = colon$y, fsel = 1:100, prior = htlr_prior("t", df = 1, logw = -10), init = "bcbc", iter = 20, thin = 1) ## using NEG prior with 1 df and log-scale fixed to -10 fit.neg <- htlr(X = colon$X, y = colon$y, fsel = 1:100, prior = htlr_prior("neg", df = 1, logw = -10), init = "bcbc", iter = 20, thin = 1)
#> Warning: random logw currently only supports t prior
## using horseshoe prior with 1 df and auto-selected log-scale fit.ghs <- htlr(X = colon$X, y = colon$y, fsel = 1:100, prior = "ghs", df = 1, init = "bcbc", iter = 20, thin = 1)
#> Warning: random logw currently only supports t prior