Impute NA values with the logmean, mean, minimum or maximum reference value.
impute_df(x, limits, method = c("logmean", "mean", "min", "max"))
x: data.frame with the columns "age" (numeric), "sex" (factor) and further user-defined numeric columns that should be imputed.
limits: data.frame, reference table. It has to have the columns "age", numeric (same units as the age column in x, e.g. days or years; an age of 0 matches all ages), "sex", factor (same levels for male and female as the sex column in x, plus a special level "both"), "param", character, with the laboratory parameter names that have to match the column names in x, and "lower" and "upper", numeric, for the lower and upper reference limits.
method: character, imputation method. method = "logmean" (default) replaces all NA with the corresponding logged mean of the reference table limits; this is intended for subsequent use of the zlog score. Use method = "mean" for subsequent z score calculation. For method = "min" or method = "max" the lower or the upper limits are used, respectively.
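The four methods can be illustrated with a minimal base R sketch. The helper `impute_value()` is hypothetical and not part of the package. It assumes that "logmean" is the mean of the limits on the log scale, back-transformed with `exp()`, and that "mean" is the arithmetic midpoint of the limits. The first assumption reproduces the imputed values shown in the examples on this page; the second is not confirmed by the text above.

```r
# Hypothetical helper, not part of the package: returns the value that
# would replace an NA for given reference limits.
impute_value <- function(lower, upper,
                         method = c("logmean", "mean", "min", "max")) {
  method <- match.arg(method)
  switch(method,
    # assumption: mean on the log scale, back-transformed
    logmean = exp((log(lower) + log(upper)) / 2),
    # assumption: arithmetic midpoint of the limits
    mean = (lower + upper) / 2,
    min = lower,
    max = upper
  )
}

# Reference limits taken from the example table on this page
lower <- c(alb = 35, bili = 2)
upper <- c(alb = 52, bili = 21)

impute_value(lower, upper)          # default "logmean"
impute_value(lower, upper, "mean")
impute_value(lower, upper, "min")
impute_value(lower, upper, "max")
```

With these limits the default call gives about 42.66 for alb and 6.48 for bili, the same values that replace the NA entries in the first example output.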
data.frame, the same as x, but missing values are replaced by the corresponding logmean, mean, minimum or maximum reference values, depending on the chosen method.
Imputation should be done prior to the z()/zlog() transformation. Afterwards, the NA could be replaced by zero (for mean-imputation) via d[is.na(d)] <- 0.
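The relation between the two routes can be sketched in base R. `zlog_sketch()` is a hypothetical stand-in for the package's zlog(), written from the published zlog definition (an assumption here, although it reproduces the alb values in the example output on this page). The sketch compares logmean imputation before the transformation with zero replacement after it.

```r
# Hypothetical stand-in for zlog(); the formula is an assumption based on
# the published zlog definition, not copied from the package source.
zlog_sketch <- function(x, lower, upper) {
  (log(x) - (log(lower) + log(upper)) / 2) *
    2 * qnorm(0.975) / (log(upper) - log(lower))
}

alb <- c(42, NA, 38, NA, 50)   # first five alb values from the example
lower <- 35
upper <- 52

# Route 1: impute the logmean first, then transform
alb_imputed <- ifelse(is.na(alb), exp((log(lower) + log(upper)) / 2), alb)
z1 <- zlog_sketch(alb_imputed, lower, upper)

# Route 2: transform first (NA stays NA), then replace NA by zero
z2 <- zlog_sketch(alb, lower, upper)
z2[is.na(z2)] <- 0

all.equal(z1, z2)
```

Under these assumptions both routes agree up to floating point error, because the logmean of the limits maps to a zlog value of zero.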
l <- data.frame(
param = c("alb", "bili"),
age = c(0, 0),
sex = c("both", "both"),
units = c("mg/l", "µmol/l"),
lower = c(35, 2),
upper = c(52, 21)
)
x <- data.frame(
age = 40:48,
sex = rep(c("female", "male"), c(5, 4)),
# from Hoffmann et al. 2017
alb = c(42, NA, 38, NA, 50, 42, 27, 31, 24),
bili = c(11, 9, NA, NA, 22, 42, NA, 200, 20)
)
impute_df(x, l)
#> age sex alb bili
#> 1 40 female 42.00000 11.000000
#> 2 41 female 42.66146 9.000000
#> 3 42 female 38.00000 6.480741
#> 4 43 female 42.66146 6.480741
#> 5 44 female 50.00000 22.000000
#> 6 45 male 42.00000 42.000000
#> 7 46 male 27.00000 6.480741
#> 8 47 male 31.00000 200.000000
#> 9 48 male 24.00000 20.000000
impute_df(x, l, method = "min")
#> age sex alb bili
#> 1 40 female 42 11
#> 2 41 female 35 9
#> 3 42 female 38 2
#> 4 43 female 35 2
#> 5 44 female 50 22
#> 6 45 male 42 42
#> 7 46 male 27 2
#> 8 47 male 31 200
#> 9 48 male 24 20
zlog_df(impute_df(x, l), l)
#> age sex alb bili
#> 1 40 female -0.1547222 0.8819855
#> 2 41 female 0.0000000 0.5474516
#> 3 42 female -1.1456903 0.0000000
#> 4 43 female 0.0000000 0.0000000
#> 5 44 female 1.5716234 2.0375165
#> 6 45 male -0.1547222 3.1154950
#> 7 46 male -4.5294925 0.0000000
#> 8 47 male -3.1616084 5.7172179
#> 9 48 male -5.6957115 1.8786269