[R] Forcing results from lm into dataframe

David Winsemius dwinsemius at comcast.net
Tue Oct 26 19:48:24 CEST 2010


On Oct 26, 2010, at 10:22 AM, Small Sandy (NHS Greater Glasgow &  
Clyde) wrote:

> Thanks David
> That's great
>
> As a matter of interest, to get a data frame by studies why do you  
> have to do
>
> fitsdf <- as.data.frame(t(as.data.frame(fits)))

The apply family of functions often returns results rotated from what  
new users expect. I am not really sure why that is so in this case.  
If you wanted to trace it out, you could look at the code. I just  
looked at as.data.frame.list (there are over 20 as.data.frame  
methods) and the answer is not immediately apparent to me. I thought  
I might see a cbind() call in there, and I suppose this section ....  
as.call(c(expression(data.frame), x ... may have that effect.
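
To see the orientation issue concretely, here is a small sketch. The
hand-made list below stands in for the real fits and is an assumption,
with values rounded from the output earlier in this thread.
as.data.frame() treats each list element as a column, which is why the
studies come out as columns and a t() is needed afterwards. Calling t()
on the bare list does not give a numeric table at all.

```r
# Toy stand-in for the list of coefficient vectors (values rounded
# from the thread; not the real fits)
fits <- list(
  CBP0908020 = c("(Intercept)" = 20.92, quartile = 3.39),
  CBP0908021 = c("(Intercept)" = 29.32, quartile = 0.01),
  CBP0908022 = c("(Intercept)" = 39.10, quartile = 3.30)
)

# A list is treated as a set of columns, so studies end up as columns
wide <- as.data.frame(fits)
dim(wide)      # 2 rows (coefficients) by 3 columns (studies)

# t() on a data frame goes through as.matrix(), giving studies as rows
fitsdf <- as.data.frame(t(wide))
dim(fitsdf)    # 3 rows (studies) by 2 columns (coefficients)

# t() applied directly to the list gives a 1 x 3 matrix whose cells
# each hold a whole coefficient vector
dim(t(fits))

# Alternative: bind the vectors row-wise and convert once
fitsdf2 <- as.data.frame(do.call(rbind, fits))
```

The do.call(rbind, ...) form is only one possible shortcut; it relies on
every element of the list having the same length and names.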

-- 
David.

>
> Why doesn't
> fitsdf <- as.data.frame(t(fits))
> work?
>
> Sandy Small
>
> ________________________________________
> From: David Winsemius [dwinsemius at comcast.net]
> Sent: 26 October 2010 16:37
> To: Small Sandy (NHS Greater Glasgow & Clyde)
> Cc: r-help at r-project.org
> Subject: Re: [R] Forcing results from lm into datframe
>
> On Oct 26, 2010, at 8:08 AM, Small Sandy (NHS Greater Glasgow & Clyde)
> wrote:
>
>> Hi
>>
>> I need some help getting results from multiple linear models into a
>> dataframe.
>> Let me explain the problem.
>>
>> I have a dataframe with ejection fraction results measured over a
>> number of quartiles and grouped by base_study.
>> My dataframe (800 different base_studies) looks like
>>
>>> afvtprelvefs
>> basestudy     quartile   ef        ef_std   entropy
>> CBP0908020  1           21.6    0.53        3.27
>> CBP0908020  2           32.5    0.61        3.27
>> CBP0908020  3           30.8    0.63        3.27
>> CBP0908020  4           33.6    0.37        3.27
>> CBP0908022  1           42.4    0.52        1.80
>> CBP0908021  1           29.4    0.70        2.63
>> CBP0908021  2           29.2    0.42        2.63
>> CBP0908021  3           29.7    0.89        2.63
>> CBP0908021  4           29.3    0.50        2.63
>> CBP0908022  2           45.7    1.30        1.80
>> ...
>>
>> What I want to do is apply a weighted linear fit to the results from
>> each base study and get the gradient out of it. I then want to plot
>> the gradient against the entropy (which is constant for each base
>> study).
>>
>> I can apply a linear fit with
>>
>>> fits <- by(afvtprelvefs, afvtprelvefs$basestudy, function (x) lm
>>> (ef ~ quartile, data=x, weights=1/ef_std))
>>
>> but how do I get the results from that into a dataframe which I can
>> use?
>>
>> I thought I might get somewhere with
>>> sapply(fits, "[[", "coefficients")
>>
>> But that doesn't give me the basestudy separately so that I can
>> match up the results with the entropy results.
>
> The by objects don't play nicely with as.data.frame so I went to a
> more "classical" way of running the lm call and I added a coef()
> wrapper to just get the coefficients:
>
>> splits <-split(afvtprelvefs, afvtprelvefs$basestudy)
>> lapply(splits, function (x) coef(lm (ef ~ quartile, data=x,
> weights=1/ef_std)))
> $CBP0908020
> (Intercept)    quartile
>   20.921397    3.385469
>
> $CBP0908021
> (Intercept)    quartile
> 29.31632071  0.01372604
>
> $CBP0908022
> (Intercept)    quartile
>        39.1         3.3
>
>> fits <- lapply(splits, function (x) coef(lm (ef ~ quartile, data=x,
> weights=1/ef_std)))
>> as.data.frame(fits)
>             CBP0908020  CBP0908021 CBP0908022
> (Intercept)  20.921397 29.31632071       39.1
> quartile      3.385469  0.01372604        3.3
>
>
> The split-lapply strategy is reasonably general. You may need to use
> t() if you were hoping for the studies to be in rows. In this case
> sapply would have obviated the need for the as.data.frame step at the
> cost of returning a matrix rather than a data.frame.
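
A minimal sketch of the sapply route, carried through to the gradient
versus entropy plot asked for in the original question. The data frame
is rebuilt from the ten rows shown above. The names coefs and result,
and the column name gradient, are made up for illustration.

```r
# Ten rows copied from the data shown in the original post
afvtprelvefs <- data.frame(
  basestudy = c(rep("CBP0908020", 4), "CBP0908022",
                rep("CBP0908021", 4), "CBP0908022"),
  quartile  = c(1, 2, 3, 4, 1, 1, 2, 3, 4, 2),
  ef        = c(21.6, 32.5, 30.8, 33.6, 42.4, 29.4, 29.2, 29.7, 29.3, 45.7),
  ef_std    = c(0.53, 0.61, 0.63, 0.37, 0.52, 0.70, 0.42, 0.89, 0.50, 1.30),
  entropy   = c(rep(3.27, 4), 1.80, rep(2.63, 4), 1.80)
)

splits <- split(afvtprelvefs, afvtprelvefs$basestudy)

# sapply() simplifies to a 2 x n matrix; t() puts one study per row
coefs <- t(sapply(splits, function(x)
  coef(lm(ef ~ quartile, data = x, weights = 1 / ef_std))))

# Entropy is constant within a study, so the first value per split is used
result <- data.frame(
  basestudy = rownames(coefs),
  gradient  = coefs[, "quartile"],
  entropy   = sapply(splits, function(x) x$entropy[1]),
  row.names = NULL
)

plot(gradient ~ entropy, data = result)
```

The weights are kept as 1/ef_std to match the thread. lm() minimises
sum(w * e^2), so if ef_std is a standard deviation, inverse-variance
weighting would be 1/ef_std^2 instead.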
> --
> David
>
>>
>> I am sure this must have been answered somewhere before but I have
>> been unable to find a solution.
>> Many thanks for your help
>>
>> Sandy Small
>> NHS Greater Glasgow and Clyde
