[Rd] Lightweight data frame class

Gabor Grothendieck ggrothendieck at myway.com
Sat Nov 27 00:41:47 CET 2004


Vadim Ogranovich <vograno <at> evafunds.com> writes:

: 
: Don't know whether it will suffice. lm() was just an example. Are you
: going to re-write lm(), e.g. lm.zoo(), to accept lists?

A previous, unreleased version of zoo did hack lm, but the current test
version interfaces to lm without making any changes to lm at all.

: I am more thinking of a general purpose class that would pass wherever
: data.frame is expected.

Yes, I figured so.  The lightweight data frame idea seems neat,
but I thought I would mention this as well, in case it's germane.

: 
: Probably I need to wait until the new version of zoo comes out. At the
: very least it could be a good prototype for what I have in mind.

If you want it before then, contact me off-list and I can send you
the beta test version.

: 
: Thanks for the info,
: Vadim
: 
: > -----Original Message-----
: > From: r-devel-bounces <at> stat.math.ethz.ch 
: > [mailto:r-devel-bounces <at> stat.math.ethz.ch] On Behalf Of Gabor 
: > Grothendieck
: > Sent: Thursday, November 25, 2004 7:42 PM
: > To: r-devel <at> stat.math.ethz.ch
: > Subject: Re: [Rd] Lightweight data frame class
: > 
: > Vadim Ogranovich <vograno <at> evafunds.com> writes:
: > 
: > : 
: > : Hi,
: > : 
: > : As far as I can tell data.frame class adds two features to those of
: > : lists:
: > : * matrix structure via [,] and [,]<- operators (well, I know these
: > : are actually "["(i, j, ...), not "[,]").
: > : * row names attribute.
: > : 
: > : It seems that the overhead of the support for the row names, both
: > : computational and RAM-wise, is rather non-trivial. I frequently
: > : subscript from a data.frame, i.e. use [,] on data frames, and my
: > : timing shows that the equivalent list operation is about 7 times
: > : faster, see below.
: > : 
: > : On the other hand, at least in my usage pattern, I really rarely
: > : benefit from the row names attribute, so as far as I am concerned
: > : row names is just an overhead. (Of course the speed difference may
: > : be due to other factors, the only thing I can tell is that
: > : subscripting is very slow in data frames relative to in lists).
: > : 
: > : I thought of writing a new class, say lightweight.data.frame, that
: > : would be polymorphic with the existing data.frame class. The class
: > : would inherit from "list" and implement [,], [,]<- operators. It
: > : would also implement the "rownames" function that would return
: > : seq(nrow(x)), etc.
: > : It should also implement as.data.frame to avoid the overhead of
: > : conversion to a full-blown data.frame in calls like lm(y ~ x,
: > : data=myLightweightDataframe).
: > 
: > The next version of zoo (currently in test) supports lists in the
: > data argument of lm and can also merge zoo series into a list (or
: > to another zoo series, as it does now).
: > Would that be a sufficient alternative?
: > 
: > ______________________________________________
: > R-devel <at> stat.math.ethz.ch mailing list
: > https://stat.ethz.ch/mailman/listinfo/r-devel
: >
: 
:
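
For readers following the thread, the class described in the quoted message
could be sketched roughly as below. This is only an illustration under stated
assumptions, not code from the thread or from zoo: the constructor name is
taken from the quoted message, while the method bodies are hypothetical. In
particular, this sketch treats a single index x[i] as row selection, which
differs from data.frame, and it omits the [,]<- operator.

```r
# Hypothetical constructor: a plain list of equal-length columns plus an S3 class.
lightweight.data.frame <- function(...) {
  cols <- list(...)
  stopifnot(length(unique(lengths(cols))) <= 1L)  # all columns the same length
  structure(cols, class = c("lightweight.data.frame", "list"))
}

# x[i, j]: select columns j, then rows i. No row names are stored or copied.
"[.lightweight.data.frame" <- function(x, i, j, ...) {
  cols <- unclass(x)
  if (!missing(j)) cols <- cols[j]
  if (!missing(i)) cols <- lapply(cols, function(col) col[i])
  structure(cols, class = class(x))
}

# Dimensions are derived from the columns rather than from stored attributes.
dim.lightweight.data.frame <- function(x) {
  cols <- unclass(x)
  c(if (length(cols)) length(cols[[1L]]) else 0L, length(cols))
}

# Row names are computed on demand as 1..nrow, as the quoted message suggests.
row.names.lightweight.data.frame <- function(x) seq_len(nrow(x))

# Conversion hook for code that insists on a real data.frame.
as.data.frame.lightweight.data.frame <- function(x, ...) {
  as.data.frame(unclass(x), ...)
}

# Made-up example data, for illustration only.
ldf <- lightweight.data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))
sub <- ldf[2:3, "y"]
```

The as.data.frame method is the hook the quoted message mentions for calls
such as lm(y ~ x, data = myLightweightDataframe); whether it removes the
conversion overhead in practice would need to be timed.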


