[R] store list objects in data.table

Naresh Gurbuxani n@re@h_gurbux@n| @end|ng |rom hotm@||@com
Sun Sep 22 13:44:16 CEST 2024


Thanks everyone for their responses.

My data is organized in a data.table.  My goal is to run an analysis 
separately for each group.  Each analysis returns an object.  If 
these objects could be stored as elements of a data.table column, it 
would simplify downstream summarizing of the results.

Let me try another example.

library(data.table)

carsdt <- setDT(copy(mtcars))

carsdt[, unique(cyl) |> length()]
#[1] 3

carsreg <- carsdt[, .(fit = lm(mpg ~ disp + hp + wt)), by = .(cyl)]

#I would like a data.table with three rows, one "lm" object for each
#cyl value

carsreg[, .N]
#[1] 36

#Here each component of "lm" object is stored in a separate row.

carsreg[1]
#     cyl                                             fit
#   <num> <lm>
#1:     6 30.27790680, 0.01610061,-0.01097072,-3.89618307

lm(mpg ~ disp + hp + wt, data = mtcars, subset = (cyl == 6)) |> coef()
#(Intercept)        disp          hp          wt
#30.27790680  0.01610061 -0.01097072 -3.89618307
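
The count of 36 rows is consistent with an "lm" object being a plain list 
of 12 components: passed bare to .(), the list appears to be spread over 
12 rows per group, and there are 3 groups.  A minimal check in base R, 
using the same cyl == 6 fit (the name fit6 is only for illustration):

```r
# Fit the cyl == 6 regression directly, outside data.table.
fit6 <- lm(mpg ~ disp + hp + wt, data = mtcars, subset = (cyl == 6))

is.list(fit6)       # an "lm" object is a list underneath
length(fit6)        # 12 components
names(fit6)[1:2]    # "coefficients" "residuals"

# Three cyl groups times 12 components per fit gives the 36 rows seen above.
3 * length(fit6)
```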

A less satisfactory solution is to extract the desired components and 
store them in the data.table.  But this requires multiple calls to lm().

carsreg2 <- carsdt[, .(coef = list(coef(lm(mpg ~ disp + hp + wt))),
                       rsq = summary(lm(mpg ~ disp + hp + wt))$r.squared),
                   by = .(cyl)]

Now if I also want to include the F-statistic, it would require an 
additional call to lm() and another column in the above data.table.  Is 
there a way to avoid this?
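
One possible approach, offered as a sketch and not as a confirmed answer 
from this thread: wrap the fit in list() so that each "lm" object becomes a 
single cell of a list column, then compute every summary from the stored 
fit.  The names carsfit and carssum are illustrative, and the use of 
copy(.SD) is a cautious assumption in case .SD's memory is reused across 
groups.

```r
library(data.table)

carsdt <- as.data.table(mtcars)

# list() around the fit keeps the whole "lm" object in one cell,
# so the result has one row per cyl group.
# copy(.SD) is a precaution (assumption: .SD may be reused between groups).
carsfit <- carsdt[, .(fit = list(lm(mpg ~ disp + hp + wt, data = copy(.SD)))),
                  by = .(cyl)]

carsfit[, .N]   # one row per cyl group

# All summaries come from the stored fits; lm() is not called again.
carssum <- carsfit[, {
    s <- summary(fit[[1]])
    .(coef  = list(coef(fit[[1]])),
      rsq   = s$r.squared,
      fstat = s$fstatistic[["value"]])
}, by = .(cyl)]
```

Adding another statistic later then only needs another entry in the second 
step, since the fits themselves are already stored.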

Naresh

On 9/22/24 2:00 AM, Bert Gunter wrote:
> Well, you may have good reasons to do things this way -- and you
> certainly do not have to explain them here.
>
> But you might wish to consider using R's poly() function and a basic
> nested list structure to do something quite similar that seems much
> simpler to me, anyway:
>
> x <- rnorm(20)
> df <- data.frame(x = x, y = x + .1*x^2 + rnorm(20, sd = .2))
> result <-
>     with(df,
>          lapply(1:2, \(i)
>                 list(degree = i,
>                      reg = lm(y ~ poly(x, i, raw = TRUE)))
>          )
>     )
>
> As you can see, 'result' is a list, each component of which is a list
> of two with names "degree" and "reg" giving the same info as each row
> of your 'mydt'. You can use lapply() and friends to access these
> results and fiddle with them as you like, such as: "extract the
> coefficients from the second degree fits only", and so forth. Also
> note that individual components of nested lists can be extracted by
> giving a vector to [[ instead of repeated [['s. For example:
> result[[2]][[2]]  ## the reg component of the degree 2 polynomial
> ## is the same as
> result[[c(2,2)]] ## this is a bit easier for me to grok.
>
> Again, feel free to ignore without replying if my gratuitous remarks
> are unhelpful.
>
> Cheers,
> Bert
>
>
> On Sat, Sep 21, 2024 at 2:25 PM Naresh Gurbuxani
> <naresh_gurbuxani using hotmail.com> wrote:
>> I am trying to store regression objects in a data.table
>>
>> df <- data.frame(x = rnorm(20))
>> df[, "y"] <- with(df, x + 0.1 * x^2 + 0.2 * rnorm(20))
>>
>> mydt <- data.table(mypower = c(1, 2), myreg = list(lm(y ~ x, data = df),
>> lm(y ~ x + I(x^2), data = df)))
>>
>> mydt
>> #   mypower    myreg
>> #     <num>   <list>
>> #1:       1 <lm[12]>
>> #2:       2 <lm[12]>
>>
>> But mydt[1, 2] has only the coefficients of the first regression.
>> mydt[2, 2] has residuals of the first regression.  These are the first
>> two components of the "lm" object.
>>
>> mydt[1, myreg[[1]]]
>> #(Intercept)           x
>> #   0.107245    1.034110
>>
>> Is there a way to put full "lm" object in each row?
>>
>> Thanks,
>> Naresh