[R] Python and R
Douglas Bates
bates at stat.wisc.edu
Sat Feb 21 18:15:55 CET 2009
Different methods of performing least squares calculations in R are discussed in
@Article{Rnews:Bates:2004,
author = {Douglas Bates},
title = {Least Squares Calculations in {R}},
journal = {R News},
year = 2004,
volume = 4,
number = 1,
pages = {17--20},
month = {June},
url = http,
pdf = Rnews2004-1
}
Some of the functions mentioned in that article have been modified. A
more up-to-date version of the comparisons in that article is
available as the Comparisons vignette in the Matrix package.
On Fri, Feb 20, 2009 at 6:06 AM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> Note that using solve can be numerically unstable for certain problems.
>
> On Fri, Feb 20, 2009 at 6:50 AM, Kenn Konstabel <lebatsnok at gmail.com> wrote:
>> Decyphering formulas seems to be the most time consuming part of lm:
>>
>> mylm1 <- function(formula, data) {
>> # not perfect but works
>> F <- model.frame(formula,data)
>> y <- model.response(F)
>> mt <- attr(F, "terms")
>> x <- model.matrix(mt,F)
>> coefs <- solve(crossprod(x), crossprod(x,y))
>> coefs
>> }
>>
>> mylm2 <- function(x, y, intercept=TRUE) {
>> if(!is.matrix(x)) x <- as.matrix(x)
>> if(intercept) x <- cbind(1,x)
>> if(!is.matrix(y)) y <- as.matrix(y)
>> solve(crossprod(x), crossprod(x,y))
>> }
>>
>>> system.time(for(i in 1:1000) mylm2(EuStockMarkets[,-1],
>>> EuStockMarkets[,"DAX"]))
>> user system elapsed
>> 6.43 0.00 6.53
>>> system.time(for(i in 1:1000) mylm1(DAX~., EuStockMarkets))
>> user system elapsed
>> 16.19 0.00 16.23
>>> system.time(for(i in 1:1000) lm(DAX~., EuStockMarkets))
>> user system elapsed
>> 21.43 0.00 21.44
>>
>> So if you need to save time, I'd suggest something close to mylm2 rather
>> than mylm1.
>>
>> Kenn
>>
>>
>>
>> On Thu, Feb 19, 2009 at 8:04 PM, Gabor Grothendieck
>> <ggrothendieck at gmail.com> wrote:
>>>
>>> On Thu, Feb 19, 2009 at 8:30 AM, Esmail Bonakdarian <esmail.js at gmail.com>
>>> wrote:
>>>
>>> > Thanks for the suggestions, I'll have to see if I can figure out how to
>>> > convert the relatively simple call to lm with an equation and the data
>>> > file
>>> > to the functions you mention (or if that's even feasible).
>>>
>>> X <- model.matrix(formula, data)
>>>
>>> will calculate the X matrix for you.
>>>
>>> >
>>> > Not an expert in statistics myself, I am mostly concentrating on the
>>> > programming aspects of R. Problem is that I suspect my colleagues who
>>> > are providing some guidance with the stats end are not quite experts
>>> > themselves, and certainly new to R.
>>> >
>>> > Cheers,
>>> >
>>> > Esmail
>>> >
>>> > Kenn Konstabel wrote:
>>> >>
>>> >> lm does lots of computations, some of which you may never need. If
>>> >> speed
>>> >> really matters, you might want to compute only those things you will
>>> >> really
>>> >> use. If you only need coefficients, then using %*%, solve and crossprod
>>> >> will
>>> >> be remarkably faster than lm
>>> >>
>>> >> # repeating someone else's example
>>> >> # lm(DAX~., EuStockMarkets)
>>> >>
>>> >> y <- EuStockMarkets[,"DAX"]
>>> >> x <- EuStockMarkets
>>> >> x[,1]<-1
>>> >> colnames(x)[1] <- "Intercept"
>>> >>
>>> >> lm(y ~ x-1)
>>> >> solve(crossprod(x), t(x))%*%y # probably this can be done more
>>> >> efficiently
>>> >>
>>> >> # and a naive timing
>>> >>
>>> >> > system.time( for(i in 1:1000) lm(y ~ x-1))
>>> >> user system elapsed
>>> >> 14.64 0.33 32.69
>>> >> > system.time(for(i in 1:1000) solve(crossprod(x), crossprod(x,y)) )
>>> >> user system elapsed
>>> >> 0.36 0.00 0.36
>>> >>
>>> >>
>>> >> Also lsfit() is a bit quicker than lm or lm.fit.
>>> >>
>>> >> Regards,
>>> >> Kenn
>>> >
>>> > ______________________________________________
>>> > R-help at r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide
>>> > http://www.R-project.org/posting-guide.html
>>> > and provide commented, minimal, self-contained, reproducible code.
>>> >
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list