[R] Using lapply in R data table
Frank S.
f_j_rod at hotmail.com
Mon Sep 26 17:28:03 CEST 2016
Dear all,
I have an R data.table like this:
library(data.table)
DT <- data.table(
  id    = rep(c(2, 5, 7), c(3, 2, 2)),
  fini  = rep(as.Date(c('2005-04-20', '2006-02-19', '2006-10-08')), c(3, 2, 2)),
  group = rep(c("A", "B", "A"), c(3, 2, 2))
)
I want to construct a new variable "exposure" defined as follows:
1) If "fini" is earlier than 2006-01-01 --> "exposure" = 1
2) If "fini" is in [2006-01-01, 2006-06-30] --> "exposure" = ("2007-01-01" - "fini") expressed in years, i.e. days / 365.25
3) If "fini" is in [2006-07-01, 2006-12-31] --> "exposure" = 0.5
So the desired output would be the following data table:
   id       fini exposure group
1:  2 2005-04-20     1.00     A
2:  2 2005-04-20     1.00     A
3:  2 2005-04-20     1.00     A
4:  5 2006-02-19     0.87     B
5:  5 2006-02-19     0.87     B
6:  7 2006-10-08     0.50     A
7:  7 2006-10-08     0.50     A
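The 0.87 shown for id 5 can be checked by hand. This is only a small base R sketch of the arithmetic for the second case. It assumes the difference is taken in days and divided by 365.25, the divisor used in the attempted code further down in this message.

```r
# Second case only: time from fini to 2007-01-01, converted to years.
# The 365.25 days-per-year divisor is an assumption taken from the
# attempted code in this message, not a fixed convention.
fini <- as.Date("2006-02-19")
days <- as.numeric(difftime(as.Date("2007-01-01"), fini, units = "days"))
exposure <- days / 365.25

days                # 316
round(exposure, 2)  # 0.87
```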
I have tried:
DT <- DT[ , list(id, fini, exposure = 0, group)]
DT.new <- lapply(DT, function(exposure){
  exposure[fini < as.Date("2006-01-01")] <- 1 # 1st case
  exposure[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30")] <-
    difftime(as.Date("2007-01-01"), fini, units="days")/365.25 # 2nd case
  exposure[fini >= as.Date("2006-07-01") & fini <= as.Date("2006-12-31")] <- 0.5 # 3rd case
  exposure # return value
})
But I get an error message.
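A likely cause, stated as an assumption because the exact error text is not quoted here: lapply(DT, ...) calls the function once per column of DT, so "fini" is not visible inside the function body. One possible alternative is to skip lapply and assign by reference with data.table's `:=`, one row subset per case. The sketch below rebuilds the example table so that it runs on its own. Leaving rows with "fini" after 2006-12-31 as NA is an assumption, since the rules above do not cover them.

```r
library(data.table)

# Example data from this message
DT <- data.table(
  id    = rep(c(2, 5, 7), c(3, 2, 2)),
  fini  = rep(as.Date(c("2005-04-20", "2006-02-19", "2006-10-08")), c(3, 2, 2)),
  group = rep(c("A", "B", "A"), c(3, 2, 2))
)

# Start with NA so that dates outside the three cases stay missing (assumption)
DT[, exposure := NA_real_]

# 1st case: fini before 2006-01-01
DT[fini < as.Date("2006-01-01"), exposure := 1]

# 2nd case: days from fini to 2007-01-01, converted to years with 365.25
DT[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30"),
   exposure := as.numeric(as.Date("2007-01-01") - fini) / 365.25]

# 3rd case: fini in the second half of 2006
DT[fini >= as.Date("2006-07-01") & fini <= as.Date("2006-12-31"), exposure := 0.5]

# Put the columns in the order of the desired output
setcolorder(DT, c("id", "fini", "exposure", "group"))
print(DT)
```

Inside DT[...] the column names are evaluated within the table itself, so "fini" and "exposure" resolve to columns without any extra lookup. The second case is stored unrounded (about 0.865 here), so round(exposure, 2) would be needed to display 0.87.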
Thanks for any help!!
Frank S.