[Rd] Lightweight data frame class

Vadim Ogranovich vograno at evafunds.com
Fri Nov 26 23:01:36 CET 2004


I don't know whether it will suffice; lm() was just an example. Are you
going to rewrite lm(), e.g. as lm.zoo(), to accept lists?
I am thinking more of a general-purpose class that would pass wherever
a data.frame is expected.

Probably I need to wait until the new version of zoo comes out. At the
very least it could be a good prototype for what I have in mind.

Thanks for the info,
Vadim

> -----Original Message-----
> From: r-devel-bounces at stat.math.ethz.ch 
> [mailto:r-devel-bounces at stat.math.ethz.ch] On Behalf Of Gabor 
> Grothendieck
> Sent: Thursday, November 25, 2004 7:42 PM
> To: r-devel at stat.math.ethz.ch
> Subject: Re: [Rd] Lightweight data frame class
> 
> Vadim Ogranovich <vograno <at> evafunds.com> writes:
> 
> : 
> : Hi,
> : 
> : As far as I can tell data.frame class adds two features to those of
> : lists:
> : * matrix structure via the [,] and [,]<- operators (well, I know these
> :   are actually "["(i, j, ...), not "[,]").
> : * row names attribute.
> : 
> : It seems that the overhead of the support for the row names, both
> : computational and RAM-wise, is rather non-trivial. I frequently
> : subscript from a data.frame, i.e. use [,] on data frames, and my timing
> : shows that the equivalent list operation is about 7 times faster, see
> : below.
> : 
> : On the other hand, at least in my usage pattern, I rarely benefit
> : from the row names attribute, so as far as I am concerned row names are
> : just overhead. (Of course the speed difference may be due to other
> : factors; the only thing I can tell is that subscripting is very slow in
> : data frames relative to lists.)
> : 
> : I thought of writing a new class, say lightweight.data.frame, that would
> : be polymorphic with the existing data.frame class. The class would
> : inherit from "list" and implement [,], [,]<- operators. It would also
> : implement the "rownames" function that would return seq(nrow(x)), etc.
> : It should also implement as.data.frame to avoid the overhead of
> : conversion to a full-blown data.frame in calls like lm(y ~ x,
> : data=myLightweightDataframe).
> 
> The next version of zoo (currently in test) supports lists in the data
> argument of lm and can also merge zoo series into a list (or to another
> zoo series, as it does now).
> Would that be a sufficient alternative?
> 
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
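
The class described in the quoted message could be sketched roughly as
follows. This is only an illustration under stated assumptions, not an
existing package: the constructor name lightweight_df is hypothetical,
only the two-index form x[i, j] is handled, [,]<- is left out, and
as.data.frame here simply performs the full conversion on demand, which
may not be what the original proposal intended.

```r
# Hypothetical constructor: a plain list of equal-length columns plus a class tag.
lightweight_df <- function(...) {
  cols <- list(...)
  stopifnot(length(unique(lengths(cols))) <= 1L)
  structure(cols, class = c("lightweight.data.frame", "list"))
}

# Matrix-style subscripting x[i, j]: select columns j, then rows i in each column.
# Only the two-index form is supported in this sketch.
`[.lightweight.data.frame` <- function(x, i, j, ...) {
  cols <- unclass(x)
  if (!missing(j)) cols <- cols[j]
  if (!missing(i)) cols <- lapply(cols, function(col) col[i])
  structure(cols, class = class(x))
}

# Dimensions are derived from the columns; nothing extra is stored.
dim.lightweight.data.frame <- function(x) {
  cols <- unclass(x)
  c(if (length(cols)) length(cols[[1L]]) else 0L, length(cols))
}

# Row names are synthesized on demand instead of being kept as an attribute.
row.names.lightweight.data.frame <- function(x) as.character(seq_len(nrow(x)))

# Conversion to a full data.frame happens only when a caller asks for one.
as.data.frame.lightweight.data.frame <- function(x, ...) {
  as.data.frame(unclass(x), ...)
}

# Usage: subset rows 2..4 of both columns, then inspect the dimensions.
ldf <- lightweight_df(x = 1:5, y = c(2.1, 3.9, 6.2, 7.8, 10.1))
sub <- ldf[2:4, c("x", "y")]
dim(sub)
```

Because no row names are stored, subscripting reduces to ordinary list
and vector indexing, which is where the speed difference discussed in
the quoted message would be expected to come from.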