C++ routines to invoke a single iteration of the iterative proportional updating (IPU) scheme. Targets and classes
are assumed to be one-dimensional in the ipf_step functions. combine_factors aggregates several vectors of
type factor into a single one to allow multidimensional IPU steps. See examples.
Usage
ipf_step_ref(w, classes, targets)
ipf_step(w, classes, targets)
ipf_step_f(w, classes, targets)
combine_factors(dat, targets)
Arguments
- w
a numeric vector of weights. All entries should be positive.
- classes
a factor variable. Must have the same length as w.
- targets
key figure to target with the IPU scheme. A numeric vector of the same length as levels(classes). This can also be a table produced by xtabs. See examples.
- dat
a data.frame containing the factor variables to be combined.
Details
ipf_step returns the adjusted weights. ipf_step_ref does the same, but updates w by reference rather than
returning a new vector. ipf_step_f returns a multiplier: adjusted weights divided by unadjusted weights.
combine_factors is designed to make ipf_step work with contingency tables produced by xtabs.
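The arithmetic behind the ipf_step variants can be sketched in base R. The helper ipf_step_sketch below is hypothetical and is not the package's C++ code. It assumes that each weight is scaled by the target of its class divided by the current weighted total of that class, which agrees with the one-dimensional example output on this page.

```r
# Hypothetical base R helper: one proportional update step for a single factor.
# Assumption: multiplier = class target / current weighted class total.
ipf_step_sketch <- function(w, classes, targets) {
  # current weighted total per class, ordered like levels(classes)
  totals <- as.numeric(tapply(w, classes, sum))
  idx <- as.integer(classes)
  # multiplier per observation (what ipf_step_f is described as returning)
  f <- as.numeric(targets)[idx] / totals[idx]
  list(multiplier = f, weights = w * f)
}

w <- c(2, 1, 1, 4)
classes <- factor(c("a", "b", "b", "a"))
targets <- c(3, 4)

res <- ipf_step_sketch(w, classes, targets)
res$multiplier                      # 0.5 2.0 2.0 0.5
res$weights                         # 1 2 2 2
tapply(res$weights, classes, sum)   # class totals now equal the targets 3 and 4
```

The multiplier and the adjusted weights carry the same information, since one is the other multiplied or divided by w. Returning the multiplier is convenient when the same adjustment has to be applied to more than one weight vector.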
Examples
############# one-dimensional ipu ##############
## create random data
nobs <- 10
classLabels <- letters[1:3]
dat <- data.frame(
  weight = exp(rnorm(nobs)),
  household = factor(sample(classLabels, nobs, replace = TRUE))
)
dat
#> weight household
#> 1 4.5053244 b
#> 2 0.2132274 c
#> 3 1.7717582 c
#> 4 0.8959820 c
#> 5 2.3290412 c
#> 6 6.5691852 a
#> 7 1.8573559 c
#> 8 0.8195709 c
#> 9 1.5995570 c
#> 10 0.3538990 b
## create targets (same length as classLabels!)
targets <- 3:5
## calculate weights
new_weight <- ipf_step(dat$weight, dat$household, targets)
cbind(dat, new_weight)
#> weight household new_weight
#> 1 4.5053244 b 3.7086786
#> 2 0.2132274 c 0.1123848
#> 3 1.7717582 c 0.9338321
#> 4 0.8959820 c 0.4722409
#> 5 2.3290412 c 1.2275565
#> 6 6.5691852 a 3.0000000
#> 7 1.8573559 c 0.9789476
#> 8 0.8195709 c 0.4319673
#> 9 1.5995570 c 0.8430708
#> 10 0.3538990 b 0.2913214
## check solution
xtabs(new_weight ~ dat$household)
#> dat$household
#> a b c
#> 3 4 5
## calculate weights "by reference"
ipf_step_ref(dat$weight, dat$household, targets)
dat
#> weight household
#> 1 3.7086786 b
#> 2 0.1123848 c
#> 3 0.9338321 c
#> 4 0.4722409 c
#> 5 1.2275565 c
#> 6 3.0000000 a
#> 7 0.9789476 c
#> 8 0.4319673 c
#> 9 0.8430708 c
#> 10 0.2913214 b
############# multidimensional ipu ##############
## create example data
factors <- c("time", "sex", "smoker", "day")
tips <- data.frame(sex=c("Female","Male","Male"), day=c("Sun","Mon","Tue"),
                   time=c("Dinner","Lunch","Lunch"), smoker=c("No","Yes","No"))
tips <- tips[factors]
## combine factors
con <- xtabs(~., tips)
cf <- combine_factors(tips, con)
cbind(tips, cf)[sample(nrow(tips), 10, replace = TRUE),]
#> time sex smoker day cf
#> 2 Lunch Male Yes Mon 8
#> 2.1 Lunch Male Yes Mon 8
#> 3 Lunch Male No Tue 20
#> 2.2 Lunch Male Yes Mon 8
#> 2.3 Lunch Male Yes Mon 8
#> 3.1 Lunch Male No Tue 20
#> 1 Dinner Female No Sun 9
#> 3.2 Lunch Male No Tue 20
#> 2.4 Lunch Male Yes Mon 8
#> 1.1 Dinner Female No Sun 9
## adjust weights
weight <- rnorm(nrow(tips)) + 5
adjusted_weight <- ipf_step(weight, cf, con)
## check outputs
con2 <- xtabs(adjusted_weight ~ ., data = tips)
sum((con - con2)^2)
#> [1] 0
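The idea of collapsing several factors into one can be approximated in base R with interaction(), whose level order matches the cell order of an xtabs table when the variables appear in the same order. This is only a sketch of the concept: whether combine_factors works this way internally is an assumption, and the data frame d is made up for illustration.

```r
# Made-up data for illustration
d <- data.frame(
  sex    = factor(c("F", "M", "M", "F")),
  smoker = factor(c("No", "Yes", "No", "No"))
)
con <- xtabs(~ ., d)

# One combined factor; the first variable varies fastest, as in the table
cf <- interaction(d, sep = ":")
stopifnot(nlevels(cf) == length(con))

levels(cf)                        # "F:No" "M:No" "F:Yes" "M:Yes"
# Cell count of the table looked up for each row through the combined factor
as.vector(con)[as.integer(cf)]    # 2 1 1 2
```

With a combined factor of this kind, the flattened table has one entry per level, which is the shape a one-dimensional step expects for its targets.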