[R] Python and R
Esmail Bonakdarian
esmail.js at gmail.com
Thu Feb 19 14:24:28 CET 2009
Gabor Grothendieck wrote:
> On Wed, Feb 18, 2009 at 7:27 AM, Esmail Bonakdarian <esmail.js at gmail.com> wrote:
>> Gabor Grothendieck wrote:
>>>
>>> See ?Rprof for profiling your R code.
>>>
>>> If lm is the culprit, rewriting your lm calls using lm.fit might help.
>> Yes, based on my informal benchmarking, lm is the main "bottleneck", the rest
>> of the code consists mostly of vector manipulations and control structures.
>>
>> I am not familiar with lm.fit, I'll definitely look it up. I hope it's similar
>> enough to make it easy to substitute one for the other.
>>
>> Thanks for the suggestion, much appreciated. (My runs now take sometimes
>> several hours, it would be great to cut that time down by any amount :-)
>>
>
> Yes, the speedup can be significant. e.g. here we cut the time down to
> 40% of the lm time by using lm.fit and we can get down to nearly 10% if
> we go even lower level:
Wow, those numbers look impressive; that would be a nice speedup to have.
I took a look at the manual and found the following at the top of
the description for lm.fit:
"These are the basic computing engines called by lm used to fit linear
models. These should usually not be used directly unless by experienced
users."
I am certainly not an experienced user - so I wonder how different it
would be to use lm.fit instead of lm.
Right now I cobble together an equation and then call lm with it and the
datafile.
I.e.,

    LM.1 = lm(as.formula(eqn), data=datafile)
    s = summary(LM.1)

I then extract some information from the summary stats.
I'm not really sure what to make of the parameter list in lm.fit.
I will look online and see if I can find an example showing its use.
Thanks for pointing me in that direction.
Esmail
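
For reference, here is a minimal sketch of how the lm call above might be swapped for lm.fit when the formula arrives as a string. The values given to `eqn` and `datafile` are hypothetical stand-ins (the built-in EuStockMarkets data and the formula from the quoted benchmark), not the poster's actual inputs. model.frame, model.matrix and model.response build the design matrix and response that lm would otherwise construct internally.

```r
# Hypothetical stand-ins for the poster's `eqn` and `datafile`
datafile <- as.data.frame(EuStockMarkets)
eqn <- "DAX ~ . - 1"

f  <- as.formula(eqn)
mf <- model.frame(f, data = datafile)      # evaluate the formula against the data
X  <- model.matrix(attr(mf, "terms"), mf)  # design matrix (numeric)
y  <- model.response(mf)                   # response vector

fit_fast <- lm.fit(X, y)                   # low-level fit, returns a plain list
fit_lm   <- lm(f, data = datafile)         # the usual high-level fit

# The two sets of coefficients should agree
print(all.equal(fit_fast$coefficients, coef(fit_lm)))
```

Note that lm.fit returns a plain list rather than an object of class "lm", so summary() cannot be applied to it directly. Any statistics needed from the summary have to be computed from the list components. If the same design matrix is reused across many fits, building X once outside the loop is likely where much of the saving would come from.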
> > system.time(replicate(1000, lm(DAX ~.-1, EuStockMarkets)))
>    user  system elapsed
>   26.85    0.07   27.35
> > system.time(replicate(1000, lm.fit(EuStockMarkets[,-1], EuStockMarkets[,1])))
>    user  system elapsed
>   10.76    0.00   10.78
> > system.time(replicate(1000, qr.coef(qr(EuStockMarkets[,-1]), EuStockMarkets[,1])))
>    user  system elapsed
>    3.33    0.00    3.34
> > lm(DAX ~.-1, EuStockMarkets)
>
> Call:
> lm(formula = DAX ~ . - 1, data = EuStockMarkets)
>
> Coefficients:
>      SMI       CAC      FTSE
>  0.55156   0.45062  -0.09392
>
> > # They all give the same coefficients:
>
> > lm.fit(EuStockMarkets[,-1], EuStockMarkets[,1])$coef
>         SMI         CAC        FTSE
>  0.55156141  0.45062183 -0.09391815
> > qr.coef(qr(EuStockMarkets[,-1]), EuStockMarkets[,1])
>         SMI         CAC        FTSE
>  0.55156141  0.45062183 -0.09391815
>
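
Since the reply above extracts values from summary(), and lm.fit does not return an object that summary() understands, the following sketch shows one way to recover the residual standard error and the coefficient standard errors by hand. It assumes a full-rank design matrix and uses the same EuStockMarkets example as the quoted benchmark; the variable names are illustrative only.

```r
# Same data as the benchmark, converted to a plain numeric matrix
m <- as.matrix(as.data.frame(EuStockMarkets))
X <- m[, -1]   # predictors: SMI, CAC, FTSE (no intercept)
y <- m[, 1]    # response: DAX

fit <- lm.fit(X, y)

# Residual standard error: sqrt(RSS / residual degrees of freedom)
rss   <- sum(fit$residuals^2)
sigma <- sqrt(rss / fit$df.residual)

# (X'X)^{-1} from the R factor of the QR decomposition stored in the fit
XtX_inv <- chol2inv(qr.R(fit$qr))

# Standard errors of the coefficients
se <- sigma * sqrt(diag(XtX_inv))
names(se) <- colnames(X)

# Cross-check against the high-level route
ref <- summary(lm(DAX ~ . - 1, data = as.data.frame(EuStockMarkets)))
print(all.equal(unname(se), unname(ref$coefficients[, "Std. Error"])))
print(all.equal(sigma, ref$sigma))
```

Which summary quantities are worth recomputing this way depends on what the surrounding code actually uses. If only the coefficients are needed, the qr.coef call from the benchmark already suffices.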