ModelInfo_mf.Rd
This function allows users to rely on the powerful caret package for cross-validating
and tuning a MetaForest analysis. Methods for MetaForest are not included in the caret
package, because the interface of caret is not entirely compatible with MetaForest's
model call. Specifically, MetaForest is not compatible with the train methods for
classes 'formula' or 'recipe', because the variance of the effect size must be a column
of the training data x. The name of this column is specified using the argument 'vi'.
ModelInfo_mf()
A ModelInfo list of length 17, to be passed to the method argument of caret's train function.
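The returned list follows caret's specification for custom models, so its components can be listed directly (a minimal sketch, assuming the metaforest package is installed):

library(metaforest)
# The 17 components include, among others, the fit, predict, and grid functions
names(ModelInfo_mf())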
To train a clustered MetaForest for nested data structures, simply provide the optional argument 'study' to the train function to specify the study ID. This should again refer to a column of x.
When training a clustered MetaForest, make sure to use 'index = groupKFold(your_study_id_variable, k = 10)' in trainControl, to sample by study ID when creating cross-validation partitions; otherwise the testing error will be positively biased.
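For example, caret's groupKFold() keeps all rows from the same study together, so no study contributes to both the training and the testing partition of a resample (a minimal sketch using a hypothetical study ID vector):

library(caret)
# Hypothetical clustered data: 20 studies contributing 3 effect sizes each
study_id <- rep(paste0("study_", 1:20), each = 3)
folds <- groupKFold(study_id, k = 10)
fit_control <- trainControl(method = "cv", index = folds)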
if (FALSE) {
# Load metaforest and metafor (metafor provides the example data and escalc)
library(metaforest)
library(metafor)
# Prepare data
data <- dat.bangertdrowns2004
data[, c(4:12)] <- apply(data[, c(4:12)], 2, function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
})
data$subject <- factor(data$subject)
data$yi <- as.numeric(data$yi)
# Load caret
library(caret)
set.seed(999)
# Specify the resampling method as 10-fold CV
fit_control <- trainControl(method = "cv", number = 10)
cv_mf_fit <- train(y = data$yi, x = data[, c(3:13, 16)],
                   method = ModelInfo_mf(), trControl = fit_control)
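# Inspect the cross-validation results; these accessors are standard for caret
# train objects (a sketch, assuming the call above ran successfully)
cv_mf_fit
cv_mf_fit$results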
# Cross-validated clustered MetaForest
data <- get(data(dat.bourassa1996))
data <- escalc(measure = "OR", ai = lh.le, bi = lh.re, ci = rh.le, di = rh.re,
               data = data, add = 1/2, to = "all")
data$mage[is.na(data$mage)] <- median(data$mage, na.rm = TRUE)
data[c(5:8)] <- lapply(data[c(5:8)], factor)
data$yi <- as.numeric(data$yi)
# Set up 10-fold grouped CV
fit_control <- trainControl(method = "cv",
                            index = groupKFold(data$sample, k = 10))
# Set up a custom tuning grid for the three tuning parameters of MetaForest
rf_grid <- expand.grid(whichweights = c("random", "fixed", "unif"),
                       mtry = c(2, 4, 6),
                       min.node.size = c(2, 4, 6))
# Train the model
cv.mf.cluster <- train(y = data$yi,
                       x = data[, c("selection", "investigator", "hand_assess",
                                    "eye_assess", "mage", "sex", "vi", "sample")],
                       study = "sample", method = ModelInfo_mf(),
                       trControl = fit_control,
                       tuneGrid = rf_grid)
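# Examine the tuning results; bestTune and finalModel are standard components of
# a caret train object, and VarImpPlot() is metaforest's variable importance plot
# (a sketch, assuming the model above trained successfully)
cv.mf.cluster$bestTune
VarImpPlot(cv.mf.cluster$finalModel)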
}