Impute missing values with a preferred model, optionally sequentially, with hyperparameter tuning and with predictive mean matching (PMM). Requires the 'helper_vimpute' script.

vimpute(
  data,
  considered_variables = names(data),
  method = setNames(as.list(rep("ranger", length(considered_variables))),
    considered_variables),
  pmm = setNames(as.list(rep(TRUE, length(considered_variables))), considered_variables),
  formula = FALSE,
  sequential = TRUE,
  nseq = 10,
  eps = 0.005,
  imp_var = TRUE,
  pred_history = FALSE,
  tune = FALSE,
  verbose = FALSE
)

Arguments

data
  • Dataset with missing values. Can be provided as a data.table or data.frame.

considered_variables
  • A character vector naming the variables to be imputed or used as predictors. Columns not listed are excluded from the imputation process.

method
  • A named list specifying the imputation method for each variable. The default is "ranger" for every variable.
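As a minimal sketch of how such a named list might be built, the code below mirrors the defaults shown in the usage section and then overrides a single entry. The variable names are taken from the examples, "robust" is one of the method names mentioned under formula, and the final call is left commented out because it assumes the VIM package and its sleep data are available.

```r
considered_variables <- c("Sleep", "Dream", "Span", "BodyWgt")

# Start from the default: "ranger" for every variable.
method <- setNames(as.list(rep("ranger", length(considered_variables))),
                   considered_variables)
# Override a single entry by name.
method[["Span"]] <- "robust"

# The pmm argument follows the same pattern (one TRUE/FALSE per variable).
pmm <- setNames(as.list(rep(TRUE, length(considered_variables))),
                considered_variables)
pmm[["Span"]] <- FALSE

# Assumed call, not run here:
# imputed <- vimpute(sleep, considered_variables = considered_variables,
#                    method = method, pmm = pmm)
```

Building the lists from considered_variables keeps their names in step with the variables actually passed to the function.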

pmm
  • TRUE/FALSE indicating whether predictive mean matching (PMM) is used. Provide as a named list with one entry per variable.
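To illustrate the general idea of predictive mean matching, here is a small base R sketch on simulated data. It is not the implementation used inside vimpute(); it assumes a linear model and a single nearest donor, whereas other PMM variants draw from several close donors.

```r
set.seed(42)
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
y[sample(n, 20)] <- NA                 # introduce missing values

obs <- !is.na(y)
fit <- lm(y ~ x, subset = obs)         # model fitted on observed cases only
pred <- predict(fit, newdata = data.frame(x = x))

# For each missing case, find the observed case with the closest prediction.
donor <- vapply(pred[!obs], function(p) {
  which(obs)[which.min(abs(pred[obs] - p))]
}, integer(1))

# Impute with the donor's observed value, not with the model prediction.
y_pmm <- y
y_pmm[!obs] <- y[donor]
```

Because every imputed value is copied from an observed case, the imputations stay within the range of values that actually occur in the data.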

formula
  • Model formulas, needed when not all variables should be used as predictors or when transformations or interactions are required. Transformations and interactions can be applied to the predictors (X); for the response (Y), only transformations are possible. Only applicable to the methods "robust" and "regularized". Provide as a list with one entry for each variable that requires a specific formula.
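A formula list could look like the sketch below. The format (a plain list of two-sided formulas, one per variable) is an assumption rather than something this page confirms, and the column names are assumed to exist in the data. The inspection step uses only base R.

```r
# Hypothetical specifications: a transformed predictor, and a transformed
# response with an interaction between two transformed predictors.
formulas <- list(
  Sleep ~ log(BodyWgt) + Danger,
  log(Span) ~ log(BodyWgt) * log(BrainWgt)
)

# Check which variables each formula refers to; the response comes first.
vars_used <- lapply(formulas, all.vars)

# Assumed call, not run here (requires the VIM package):
# imputed <- vimpute(sleep, method = method, formula = formulas)
```

Listing the variables with all.vars() is a quick way to confirm that every name in a formula is also present in considered_variables.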

sequential
  • If TRUE, all variables are imputed sequentially.

nseq
  • Maximum number of iterations (if sequential is TRUE).

eps
  • Threshold for convergence.
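The interplay of nseq and eps can be pictured with a toy loop: values are re-imputed until they change by less than eps or until nseq iterations are reached. The sketch below uses simulated data, a linear model and the largest absolute change as the criterion. All of these are assumptions for illustration; the criterion used by vimpute() is not stated here.

```r
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 2 * x + rnorm(n, sd = 0.1)
miss <- 1:5                          # pretend these y values are missing
y_imp <- y
y_imp[miss] <- mean(y[-miss])        # crude starting values

nseq <- 10                           # maximum number of iterations
eps <- 0.005                         # convergence threshold
for (it in seq_len(nseq)) {
  old <- y_imp[miss]
  fit <- lm(y_imp ~ x)               # refit on the completed data
  y_imp[miss] <- predict(fit, newdata = data.frame(x = x[miss]))
  change <- max(abs(y_imp[miss] - old))
  if (change < eps) break            # stop early once the values settle
}
```

After the loop, it holds the number of iterations actually performed and change holds the size of the last update.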

imp_var
  • If TRUE, the imputed values are stored.

pred_history
  • If TRUE, all predicted values across all iterations are stored.

tune
  • If TRUE, hyperparameters are tuned halfway through the iterations.

verbose
  • If TRUE, additional debugging output is provided.

Value

The imputed data set or, if pred_history is TRUE, c(imputed data set, prediction history).

See also

Other imputation methods: hotdeck(), impPCA(), irmi(), kNN(), matchImpute(), medianSamp(), rangerImpute(), regressionImp(), sampleCat(), xgboostImpute()

Examples

if (FALSE) { # \dontrun{
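data(sleep, package = "VIM")  # VIM's mammal sleep data, not datasets::sleep
# Non-sequential imputation of all variables
x <- vimpute(data = sleep, sequential = FALSE)
# Sequential imputation with at most 3 iterations
y <- vimpute(data = sleep, sequential = TRUE, nseq = 3)
# Restrict imputation targets and predictors to four variables
z <- vimpute(data = sleep, sequential = FALSE,
  considered_variables = c("Sleep", "Dream", "Span", "BodyWgt"))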
} # }