Impute NA values with the logmean, mean, minimum or maximum reference value.
impute_df(x, limits, method = c("logmean", "mean", "min", "max"))
x: data.frame with the columns "age" (numeric), "sex" (factor),
and any further user-defined numeric columns that should be imputed.
limits: data.frame, reference table, which must have the columns:
"age", numeric (same units as the age column in x, e.g. days or years; an
age of 0 matches all ages),
"sex", factor (same levels for male and female as the sex column in x, plus
a special level "both"),
"param", character, the laboratory parameter name, which has to match the
corresponding column name in x,
"lower" and "upper", numeric, the lower and upper reference limits.
method: character, imputation method. method = "logmean" (default)
replaces every NA with the mean of the log-transformed reference limits,
back-transformed to the original scale (intended for subsequent use of the
zlog score; use method = "mean" for z score calculation).
For method = "min" or method = "max" the lower or the upper limit is used,
respectively.
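The four methods can be illustrated for a single pair of reference limits. This is a minimal sketch in base R, not the package's implementation: impute_value is a hypothetical helper, and reading "mean" as the arithmetic mean of the two limits is an assumption. The "logmean" branch reproduces the imputed values shown in the example output on this page (42.66146 for alb, 6.480741 for bili).

```r
# Hypothetical helper (not part of the package): the value that would be
# substituted for a missing measurement, given one pair of reference limits.
impute_value <- function(lower, upper,
                         method = c("logmean", "mean", "min", "max")) {
    method <- match.arg(method)
    switch(method,
        # mean on the log scale, back-transformed (the geometric mean)
        logmean = exp((log(lower) + log(upper)) / 2),
        # assumption: arithmetic mean of the two limits
        mean = (lower + upper) / 2,
        min = lower,
        max = upper
    )
}

impute_value(35, 52)          # albumin limits, default "logmean"
impute_value(2, 21, "min")    # bilirubin, lower reference limit
```

Because the logmean is the midpoint of the limits on the log scale, a value imputed this way sits exactly at the centre of the reference interval after a log transformation.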
data.frame, the same as x except that missing values are replaced by
the corresponding logmean, mean, minimum or maximum reference value,
depending on the chosen method.
Imputation should be done prior to the z()/zlog() transformation.
After the transformation, the NA values could still be replaced by zero
(equivalent to mean imputation) via d[is.na(d)] <- 0.
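The two routes in this note can be compared with a small base R sketch. It does not call the package: zlog_sketch is a hypothetical re-implementation of the zlog score for one pair of limits, and it assumes the limits are the 2.5 % and 97.5 % quantiles of a log-normal reference distribution. It uses the zlog score with logmean imputation; the z score with mean imputation behaves analogously.

```r
# Hypothetical re-implementation of the zlog score for one pair of limits;
# assumes the limits are the 2.5 % and 97.5 % quantiles.
zlog_sketch <- function(x, lower, upper) {
    mu <- (log(lower) + log(upper)) / 2
    sigma <- (log(upper) - log(lower)) / (2 * qnorm(0.975))
    (log(x) - mu) / sigma
}

alb <- c(42, NA, 38)
lower <- 35
upper <- 52

# Route 1: impute the logmean first, then transform.
imputed <- ifelse(is.na(alb), exp((log(lower) + log(upper)) / 2), alb)
z1 <- zlog_sketch(imputed, lower, upper)

# Route 2: transform first, then set the remaining NA to zero.
z2 <- zlog_sketch(alb, lower, upper)
z2[is.na(z2)] <- 0

all.equal(z1, z2)
```

Under these assumptions both routes give the same scores, because the logmean maps to a zlog value of zero. Imputing with "min" or "max" has no such shortcut, which is one reason to impute before transforming.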
l <- data.frame(
    param = c("alb", "bili"),
    age = c(0, 0),
    sex = c("both", "both"),
    units = c("mg/l", "µmol/l"),
    lower = c(35, 2),
    upper = c(52, 21)
)
x <- data.frame(
    age = 40:48,
    sex = rep(c("female", "male"), c(5, 4)),
    # from Hoffmann et al. 2017
    alb = c(42, NA, 38, NA, 50, 42, 27, 31, 24),
    bili = c(11, 9, NA, NA, 22, 42, NA, 200, 20)
)
impute_df(x, l)
#> age sex alb bili
#> 1 40 female 42.00000 11.000000
#> 2 41 female 42.66146 9.000000
#> 3 42 female 38.00000 6.480741
#> 4 43 female 42.66146 6.480741
#> 5 44 female 50.00000 22.000000
#> 6 45 male 42.00000 42.000000
#> 7 46 male 27.00000 6.480741
#> 8 47 male 31.00000 200.000000
#> 9 48 male 24.00000 20.000000
impute_df(x, l, method = "min")
#> age sex alb bili
#> 1 40 female 42 11
#> 2 41 female 35 9
#> 3 42 female 38 2
#> 4 43 female 35 2
#> 5 44 female 50 22
#> 6 45 male 42 42
#> 7 46 male 27 2
#> 8 47 male 31 200
#> 9 48 male 24 20
zlog_df(impute_df(x, l), l)
#> age sex alb bili
#> 1 40 female -0.1547222 0.8819855
#> 2 41 female 0.0000000 0.5474516
#> 3 42 female -1.1456903 0.0000000
#> 4 43 female 0.0000000 0.0000000
#> 5 44 female 1.5716234 2.0375165
#> 6 45 male -0.1547222 3.1154950
#> 7 46 male -4.5294925 0.0000000
#> 8 47 male -3.1616084 5.7172179
#> 9 48 male -5.6957115 1.8786269