[R] store list objects in data.table
Naresh Gurbuxani
naresh_gurbuxani using hotmail.com
Sun Sep 22 13:44:16 CEST 2024
Thanks everyone for their responses.
My data is organized in a data.table. My goal is to perform analyses
according to some groups. The results of analysis are objects. If
these objects could be stored as elements of a data.table, this would
help downstream summarizing of results.
Let me try another example.
carsdt <- setDT(copy(mtcars))
carsdt[, unique(cyl) |> length()]
#[1] 3
carsreg <- carsdt[, .(fit = lm(mpg ~ disp + hp + wt)), by = .(cyl)]
# I would like a data.table with three rows, one for each "lm" object
# corresponding to a cyl value
carsreg[, .N]
#[1] 36
#Here each component of "lm" object is stored in a separate row.
carsreg[1]
#      cyl                                             fit
#    <num>                                            <lm>
# 1:     6 30.27790680, 0.01610061,-0.01097072,-3.89618307
lm(mpg ~ disp + hp + wt, data = mtcars, subset = (cyl == 6)) |> coef()
# (Intercept)        disp          hp          wt
# 30.27790680  0.01610061 -0.01097072 -3.89618307
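The row count of 36 seems consistent with an "lm" object being an ordinary list underneath: a list returned inside .() is taken as a column, with one list element per row. A small check of that reading, where fit6 is only an illustrative name:

```r
# An "lm" fit is a plain list underneath; fit6 is an illustrative name.
fit6 <- lm(mpg ~ disp + hp + wt, data = mtcars, subset = (cyl == 6))

is.list(fit6)        # TRUE
length(fit6)         # 12 components for this fit
length(fit6) * 3L    # 36, matching carsreg[, .N] for the three cyl groups
names(fit6)[1:2]     # "coefficients" "residuals"
```

This would also explain why carsreg[1] shows only the coefficients: they are the first component of the fit for cyl == 6.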
A less satisfactory solution is to extract the desired components and
store them in the data.table. But this requires multiple calls to lm().
carsreg2 <- carsdt[, .(coef = list(coef(lm(mpg ~ disp + hp + wt))),
                       rsq = summary(lm(mpg ~ disp + hp + wt))$r.squared),
                   by = .(cyl)]
Now if I also want to include the F-statistic, it would require an
additional call to lm() and adding a column to the above data.table. Is
there a way to avoid this?
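One possible approach, sketched here on the assumption that data.table keeps a length-one list returned in j as a single cell of a list column: wrap the fit in list(), so that each group contributes one row holding the whole "lm" object, and then derive any summaries from the stored fits. The names carsfit, rsq and fstat are illustrative only.

```r
library(data.table)

carsdt <- setDT(copy(mtcars))

# list(lm(...)) has length one, so each cyl group yields exactly one row
# whose 'fit' cell holds the complete "lm" object.
carsfit <- carsdt[, .(fit = list(lm(mpg ~ disp + hp + wt))), by = .(cyl)]
carsfit[, .N]    # 3

# Further quantities are computed from the stored fits, so lm() is not
# called again when another statistic is wanted.
carsfit[, rsq   := sapply(fit, \(m) summary(m)$r.squared)]
carsfit[, fstat := sapply(fit, \(m) summary(m)$fstatistic[["value"]])]
```

With this layout, coef(carsfit[cyl == 6, fit][[1]]) should reproduce the coefficients of the separate lm() call on the six-cylinder subset.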
Naresh
On 9/22/24 2:00 AM, Bert Gunter wrote:
> Well, you may have good reasons to do things this way -- and you
> certainly do not have to explain them here.
>
> But you might wish to consider using R's poly() function and a basic
> nested list structure to do something quite similar that seems much
> simpler to me, anyway:
>
> x <- rnorm(20)
> df <- data.frame(x = x, y = x + .1*x^2 + rnorm(20, sd = .2))
> result <-
>   with(df,
>     lapply(1:2, \(i)
>       list(
>         degree = i, reg = lm(y ~ poly(x, i, raw = TRUE))
>       )
>     )
>   )
>
> As you can see, 'result' is a list, each component of which is a list
> of two with names "degree" and "reg" giving the same info as each row
> of your 'mydt'. You can use lapply() and friends to access these
> results and fiddle with them as you like, such as: "extract the
> coefficients from the second degree fits only", and so forth. Also
> note that individual components of nested lists can be extracted by
> giving a vector to [[ instead of repeated [['s. For example:
> result[[2]][[2]] ## the reg component of the degree 2 polynomial
> ## is the same as
> result[[c(2,2)]] ## this is a bit easier for me to grok.
>
> Again, feel free to ignore without replying if my gratuitous remarks
> are unhelpful.
>
> Cheers,
> Bert
>
>
> On Sat, Sep 21, 2024 at 2:25 PM Naresh Gurbuxani
> <naresh_gurbuxani using hotmail.com> wrote:
>> I am trying to store regression objects in a data.table
>>
>> df <- data.frame(x = rnorm(20))
>> df[, "y"] <- with(df, x + 0.1 * x^2 + 0.2 * rnorm(20))
>>
>> mydt <- data.table(mypower = c(1, 2), myreg = list(lm(y ~ x, data = df),
>> lm(y ~ x + I(x^2), data = df)))
>>
>> mydt
>> # mypower myreg
>> # <num> <list>
>> #1: 1 <lm[12]>
>> #2: 2 <lm[12]>
>>
>> But mydt[1, 2] has only the coefficients of the first regression. mydt[2,
>> 2] has residuals of the first regression. These are the first two
>> components of "lm" object.
>>
>> mydt[1, myreg[[1]]]
>> #(Intercept) x
>> # 0.107245 1.034110
>>
>> Is there a way to put full "lm" object in each row?
>>
>> Thanks,
>> Naresh
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.