[R] Using lapply in R data table

Ista Zahn istazahn at gmail.com
Mon Sep 26 20:37:59 CEST 2016


On Mon, Sep 26, 2016 at 1:59 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> This seems like a job for cut() .

I thought that at first too, but the middle group shouldn't be a fixed .87; it
should be computed for each row as

"exposure" = "2007-01-01" - "fini"

so I think cut() alone won't do it.
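
For what it's worth, here is a minimal sketch of how cut() could still pick the
period while the middle value is computed per row. It uses a plain data frame
and assumes the half-open intervals [lower, upper) that cut() gives with
right = FALSE; the names breaks, period and years_left are made up for
illustration and are not from the original question.

```r
# Sketch only: base R, no data.table needed. Column names follow the thread.
DT <- data.frame(
  id    = rep(c(2, 5, 7), c(3, 2, 2)),
  fini  = rep(as.Date(c("2005-04-20", "2006-02-19", "2006-10-08")), c(3, 2, 2)),
  group = rep(c("A", "B", "A"), c(3, 2, 2))
)

# Period boundaries (assumed): before 2006-01-01, first half of 2006,
# second half of 2006. right = FALSE makes each interval [lower, upper).
breaks <- as.Date(c("2000-01-01", "2006-01-01", "2006-07-01", "2007-01-01"))
period <- as.integer(cut(DT$fini, breaks = breaks, right = FALSE))

# Per-row value for the middle period: years from fini to 2007-01-01.
years_left <- as.numeric(
  difftime(as.Date("2007-01-01"), DT$fini, units = "days")
) / 365.25

# Choose the value by period: 1 -> 1, 2 -> computed, 3 -> 0.5.
# Dates outside the breaks give NA.
DT$exposure <- ifelse(period == 1, 1,
               ifelse(period == 2, years_left, 0.5))
```

Here exposure stays numeric rather than becoming a factor, and the same
expressions should carry over to a data table, though I have not checked that.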

Best,
Ista
>
> (I made DT a data frame to avoid loading the data table package. But I
> assume it would work with a data table too. Check this, though!)
>
>> DT <- within(DT, exposure <- cut(fini, as.Date(c("2000-01-01","2006-01-01","2006-07-01","2007-01-01")), labels = c(1,.87,.5)))
>
>> DT
>   id       fini group exposure
> 1  2 2005-04-20     A        1
> 2  2 2005-04-20     A        1
> 3  2 2005-04-20     A        1
> 4  5 2006-02-19     B     0.87
> 5  5 2006-02-19     B     0.87
> 6  7 2006-10-08     A      0.5
> 7  7 2006-10-08     A      0.5
>
>
> (but note that exposure is a factor, not numeric)
>
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Sep 26, 2016 at 10:05 AM, Ista Zahn <istazahn at gmail.com> wrote:
>> Hi Frank,
>>
>> lapply(DT) iterates over each column. That doesn't seem to be what you want.
>>
>> There are probably better ways, but here is one approach.
>>
>> DT[, exposure := vector(mode = "numeric", length = .N)]
>> DT[fini < as.Date("2006-01-01"), exposure := 1]
>> DT[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30"),
>>       exposure := difftime(as.Date("2007-01-01"), fini, units="days")/365.25]
>> DT[fini >= as.Date("2006-07-01"), exposure := 0.5]
>>
>> Best,
>> Ista
>>
>> On Mon, Sep 26, 2016 at 11:28 AM, Frank S. <f_j_rod at hotmail.com> wrote:
>>> Dear all,
>>>
>>> I have an R data table like this:
>>>
>>> DT <- data.table(
>>>   id = rep(c(2, 5, 7), c(3, 2, 2)),
>>>   fini = rep(as.Date(c('2005-04-20', '2006-02-19', '2006-10-08')), c(3, 2, 2)),
>>>   group = rep(c("A", "B", "A"), c(3, 2, 2))  )
>>>
>>>
>>> I want to construct a new variable "exposure" defined as follows:
>>>
>>> 1) If "fini" earlier than 2006-01-01 --> "exposure" = 1
>>> 2) If "fini" in [2006-01-01, 2006-06-30] --> "exposure" = "2007-01-01" - "fini"
>>> 3) If "fini" in [2006-07-01, 2006-12-31] --> "exposure" = 0.5
>>>
>>>
>>> So the desired output would be the following data table:
>>>
>>>    id       fini exposure group
>>> 1:  2 2005-04-20     1.00     A
>>> 2:  2 2005-04-20     1.00     A
>>> 3:  2 2005-04-20     1.00     A
>>> 4:  5 2006-02-19     0.87     B
>>> 5:  5 2006-02-19     0.87     B
>>> 6:  7 2006-10-08     0.50     A
>>> 7:  7 2006-10-08     0.50     A
>>>
>>>
>>> I have tried:
>>>
>>> DT <- DT[ , list(id, fini, exposure = 0, group)]
>>> DT.new <- lapply(DT, function(exposure){
>>>       exposure[fini < as.Date("2006-01-01")] <- 1   # 1st case
>>>       exposure[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30")] <- difftime(as.Date("2007-01-01"), fini, units="days")/365.25 # 2nd case
>>>     exposure[fini >= as.Date("2006-07-01") & fini <= as.Date("2006-12-31")] <- 0.5       # 3rd case
>>>       exposure  # return value
>>>   })
>>>
>>>
>>> But I get an error message.
>>>
>>> Thanks for any help!!
>>>
>>>
>>> Frank S.
>>>
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>


