[R] R runtime performance and memory usage

Sasikumar Kandhasamy ckmsasi at gmail.com
Tue Nov 17 00:25:03 CET 2015


Thanks a lot Bill & Bert.

Hi Bill,

Sorry, I was wrong about the number of records; I am actually using
two-dimensional data of 250K records each. Regarding CPU versus elapsed
time, what I reported was the elapsed time. In fact, I have pinned one core
to run R.

Thanks & Regards
Sasi

On Mon, Nov 16, 2015 at 2:04 PM, William Dunlap <wdunlap at tibco.com> wrote:

> You cannot do a linear regression with one column of data - there must
> be at least one response column and one predictor.  By default, lm
> throws in a constant term which gives you a second predictor.  If your
> predictor is categorical, you get a new column for all but the first
> unique value in it.
>
> lm() deals only with double precision data, at 8 bytes/number.  Thus
> 250k numbers occupy 2 million bytes.  Your three columns (in the
> non-categorical-predictor case) take up 6 million bytes.
>
> lm()'s output contains several columns the size of the response
> variable: residuals, effects, and fitted.values.  It also contains the
> QR decomposition of the design matrix (the size of all the predictor
> columns together).
>
> There are also some temporary variables generated in the course of the
> computation.
>
> So your observed 40 MB memory usage seems reasonable.
>
> Use the object.size() function to see how big objects are and str() to
> look at their structure.
>
> My laptop with a 2.5 GHz Intel i7 processor takes a quarter second to
> fit a simple linear model with one numeric predictor and a constant
> term.  6 seconds sounds slow.  Is that CPU or elapsed time (use
> system.time() to see)?
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Mon, Nov 16, 2015 at 12:25 PM, Sasikumar Kandhasamy
> <ckmsasi at gmail.com> wrote:
> > Hi All,
> >
> > I have a couple of questions about R run-time performance. I have the
> > R-3.2.2 package compiled for MIPS64 and am running it on my Linux machine
> > with a MIPS64 processor (core speed 1.5 GHz), and I am observing the
> > following behaviors:
> >
> > 1. Applying a "linear regression model" (lm) to 1 MB of data (1 column
> > of 250K records) takes ~6 seconds to complete. Any idea whether this is
> > expected behavior? If not, can you please share suggestions or options
> > to improve it, if there are any?
> >
> > 2. Also, the R process's runtime virtual memory increases by 40 MB after
> > applying the linear model to the 1 MB of data. Is this also expected
> > behavior? If it is, can you please share some insight into the memory
> > usage?
> >
> > Thanks in advance.
> >
> > Regards
> > Sasi
>
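
The memory accounting in Bill's quoted reply can be checked roughly with
object.size() and str(), as he suggests. The following is only a minimal
sketch: the simulated vectors x and y (250,000 values each) are an assumption
standing in for the original data, which was not posted. Keep in mind that
object.size() gives a rough indication only and does not detect parts shared
between list components, so the per-component figures may overstate the real
footprint.

```r
# Hypothetical stand-in data: 250,000 rows, one numeric predictor, one response.
set.seed(1)
n <- 250000
x <- rnorm(n)
y <- 2 * x + rnorm(n)

# A double takes 8 bytes, so one column should be roughly 2 million bytes.
col_bytes <- as.numeric(object.size(x))
print(col_bytes)

fit <- lm(y ~ x)

# Size of the whole fitted object.
print(object.size(fit), units = "Mb")

# Size of each top-level component (residuals, effects, fitted.values, qr, ...).
component_bytes <- sapply(fit, function(part) as.numeric(object.size(part)))
print(sort(component_bytes, decreasing = TRUE))

# Structure of the fitted object, one level deep.
str(fit, max.level = 1)
```

The sorted component sizes show which parts of the lm() result account for
most of the memory on a given machine.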

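To separate CPU time from elapsed time, as asked in the quoted reply,
system.time() can wrap the lm() call. This is a sketch under the same
assumption as before: x and y are simulated placeholders, not the original
data, and the timings will differ from machine to machine.

```r
# Hypothetical stand-in data: 250,000 rows, one numeric predictor, one response.
set.seed(1)
n <- 250000
x <- rnorm(n)
y <- 2 * x + rnorm(n)

# Time a single fit; the result holds user, system, and elapsed seconds.
timing <- system.time(fit <- lm(y ~ x))
print(timing)

# CPU time used by this R process versus wall-clock time.
cpu_seconds <- timing[["user.self"]] + timing[["sys.self"]]
elapsed_seconds <- timing[["elapsed"]]
cat("CPU seconds:    ", cpu_seconds, "\n")
cat("Elapsed seconds:", elapsed_seconds, "\n")
```

If the elapsed time is much larger than the CPU time, the process is spending
time waiting (for example on I/O or swapping) rather than computing.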