setup_model_MegaLMM.RdSets up the MegaLMM model, selects starting values, and pre-calculates matrices for the Gibbs sampler.
setup_model_MegaLMM(
Y,
formula,
extra_regressions = NULL,
data,
relmat = NULL,
cis_genotypes = NULL,
Lambda_fixed = NULL,
run_parameters = MegaLMM_control(),
posteriorSample_params = c("Lambda", "U_F", "F", "delta", "tot_F_prec", "F_h2",
"tot_Eta_prec", "resid_h2", "B1", "B2_F", "B2_R", "U_R", "cis_effects",
"Lambda_m_eff", "Lambda_pi", "B2_R_pi", "B2_F_pi"),
posteriorMean_params = c(),
posteriorFunctions = list(),
run_ID = "MegaLMM_run"
)either a) a n x p matrix of data (n individuals x p traits), or b) a list describing
the observation_model, data, and associated parameters. This list should contain:
i) observation_model: a function modeled after missing_observation_model
(see code by typing this into the console) that draws posterior samples of Eta conditional on
the observations (Y) and the current state of the MegaLMM model (current_state).
The function should have the same form as missing_data,
and must return Eta even if current_state is NULL.
ii) observations: a data.frame containing the observaition-level data and associated covariates.
This must include a column ID that is also present in data
iii) any other parameters necessary for observation_model
RHS of a model. The syntax is similar to lmer.
Random effects are specified by (1+factor | group), with the left side of the '|' a design
matrix, and the right side a random effect factor (group). For each random effect factor, a
covariance matrix (K_mats) or precision matrix (K_inv_mats) can be provided.
Unlike in lmer, each variable or covariate in the design matrix is given an
independent random effect (ie no covariance among random effects is modeled), so two bars '||'
gives an identical model to one bar.
Note: the speed of the model will decrease dramatically with the number of random effects (multiplied
by h2_divisions).
Fixed effects only apply to the model residuals (ie X1) below, and are not regularized (prior precision = 0)
Optional. A list including either:
i) the matrix X (n x b) of regression coeffients, or
ii) two matrices U (n x m) and V (m x b) such that X = U*V
also, logical variables resids and fixed specify whether these coefficients apply to either or
both of the model residuals or the factors
data.frame with n rows containing columns corresponding to the fixed and random effects
Optional. A list of covariance matrices for random effects. If none provided for any of the random effects, K is assumed to be the identity.
Optional. A list of n x ci matrices of length p giving cis-effect coefficients for each trait
Optional. A matrix of the first k rows of Lambda that are fixed
See MegaLMM_control
A character vector giving names of parameters to save all posterior samples
A character vector giving names of parameters to save only the posterior mean.
A list of named and quoted functions that that will calculate statistics based on the current_state.
The functions can use variables in current_state, data_matrices, priors, or in the user's environment.
A unique identifier for this model. The code will create a folder with this name to hold all posterior samples and diagnostic information during the run.
An object of class MegaLMM_state with components:
current_state: a list of parameters in the current iteration of the sampler. Initially empty
Posterior: a list of arrays of posterior samples. Initially empty
RNG: current state of R's Random number generator (for re-starting chaings)
traitnames: vector of trait names (from colnames of Y)
run_parameters, run_variables, data_matrices, priors: input data and parameters
The first step in fitting a MegaLMM model. This function sets up the model matrices based on the fixed and random effect formulas provided. This function must be followed by calls to set_priors_MegaLMM, initialize_variables_MegaLMM and initialize_MegaLMM, before the Gibbs sampler can be run with sample_MegaLMM.
The model is specified as:
y_i = g(eta_i)
Eta = rbind(eta_1,...,eta_n) = X1*B1 + X2_R*B2_R + F*Lambda + Z*U_R + E_R
F = X2_F * B2_F + Z*U_F + E_F
For sampling, we reparameterize as:
Qt*Eta = Qt*X*B + Qt*F*Lambda + Qt*ZL*U_R + Qt*E_R
Qt*F = Qt*X_F * B_F + Qt*ZL*U_F + Qt*E_F
where LTL = K and ZL = Z*L
We sample the quantities Qt*Eta, Qt*F, B, Lambda, U_R, U_F. We then back-calculate Eta and F.
Note: In the original MegaLMM paper, we set y_i = eta_i, so replaced Eta with Y above.