Generating a new household ID for survey data using a household ID and a personal ID.

In surveys with a rotating panel design that samples households, household members can move from an existing household to a new one that was not originally in the sample. This creates so-called split households. Using a personal ID (which stays fixed over the whole survey), an indicator for the different time steps, and a household ID, a new household ID is assigned to the original and the split household.
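To make the notion of a split household concrete, the following minimal sketch builds a toy panel in base R and lists the persons observed in more than one household. The data frame toy and its columns year, pid and hid are made up for illustration and are not part of the package.

```r
# Toy panel (hypothetical data): persons 1 and 2 start in household 10.
# In 2011 person 2 moves to household 99, which was not in the original sample.
toy <- data.frame(
  year = c(2010, 2010, 2010, 2011, 2011, 2011),
  pid  = c(1, 2, 3, 1, 2, 3),
  hid  = c(10, 10, 20, 10, 99, 20)
)

# Number of distinct household IDs per person over all periods
n_hid <- tapply(toy$hid, toy$pid, function(x) length(unique(x)))

# Persons observed in more than one household, i.e. members of split households
split_pid <- as.numeric(names(n_hid)[n_hid > 1])
split_pid
```

Any person returned by this check carries more than one household ID across periods, which is the situation the function is meant to resolve.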

generate.HHID(dat, period = "RB010", pid = "RB030", hid = "DB030")



dat: data.table or data.frame containing the survey data


period: column name of dat containing an indicator for the rotations, e.g. years, quarters, months, etc.


pid: column name of dat containing the personal identifier. This needs to be fixed for an individual throughout the whole survey.


hid: column name of dat containing the household ID. This needs to be fixed for a household throughout the whole survey.


the survey data dat as a data.table object containing a new and an old household ID. The new household ID, which accounts for split households, is stored in the column named by hid, and the original household ID is kept in a column with the suffix "_orig".
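The shape of the returned data can be illustrated with a simplified base R sketch. It assumes one possible rule, namely that each person keeps the household ID from the first period in which they are observed. This is not necessarily how generate.HHID is implemented, and the toy data and column names are hypothetical. Members of a split household who never lived in the original household would need additional handling, which the sketch omits.

```r
# Toy panel (hypothetical data): person 2 moves from household 10 to 99 in 2011
toy <- data.frame(
  year = c(2010, 2010, 2010, 2011, 2011, 2011),
  pid  = c(1, 2, 3, 1, 2, 3),
  hid  = c(10, 10, 20, 10, 99, 20)
)

# Keep the original household ID in a column with the suffix "_orig"
toy$hid_orig <- toy$hid

# Assumed rule: look up each person's household ID in their first observed period
ord <- toy[order(toy$pid, toy$year), ]
is_first <- !duplicated(ord$pid)
first_hid <- setNames(ord$hid_orig[is_first], ord$pid[is_first])

# Overwrite hid so that it is constant per person across all periods
toy$hid <- unname(first_hid[as.character(toy$pid)])

# Every person now belongs to exactly one household ID
tapply(toy$hid, toy$pid, function(x) length(unique(x)))
```

In this sketch the original and the split household share one ID in hid, while hid_orig still shows that person 2 was recorded under household 99 in 2011.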


if (FALSE) {
library(surveysd)
library(laeken)
library(data.table)

eusilc <- surveysd:::demo.eusilc(n=4)

# create split households
eusilc[,rb030split:=rb030]
year <- eusilc[,unique(year)]
year <- year[-1]
leaf_out <- c()
for(y in year) {
  # persons from the previous year who will move into another household
  split.person <- eusilc[year==(y-1)&!duplicated(db030)&!db030%in%leaf_out,
                         sample(rb030,20)]
  # persons in the current year whose personal ID gets overwritten
  overwrite.person <- eusilc[year==(y)&!duplicated(db030)&!db030%in%leaf_out,
                             .(rb030=sample(rb030,20))]
  overwrite.person[,c("rb030split","year_curr"):=.(split.person,y)]

  # apply the overwritten personal ID from the current year onwards
  eusilc[overwrite.person, rb030split:=i.rb030split,on=.(rb030,year>=year_curr)]
  # exclude households that were already involved in a split
  leaf_out <- c(
    leaf_out,
    eusilc[rb030%in%c(overwrite.person$rb030,overwrite.person$rb030split),
           unique(db030)])
}

# pid which are in split households
eusilc[,.(uniqueN(db030)),by=list(rb030split)][V1>1]

# assign new household IDs
eusilc.new <- generate.HHID(eusilc, period = "year", pid = "rb030split",
                            hid = "db030")

# no longer any split households in the data
eusilc.new[,.(uniqueN(db030)),by=list(rb030split)][V1>1]
}