[Rd] data frame subscription operator
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Nov 8 09:21:05 CET 2006
'[' is the 'subscript' or 'extraction' operator, not the 'subscription'
operator; its use is also called 'indexing', as in 'An Introduction to R'.
On Mon, 6 Nov 2006, Vladimir Dergachev wrote:
> I was looking at the data frame subscription operator (attached at the end
> of this e-mail) and got puzzled by the following line:
>
> class(x) <- attr(x, "row.names") <- NULL
>
> This appears to set the class and row.names attributes of the incoming data
> frame to NULL.
Actually no, it removes them: see ?attr and ?class.
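
A minimal sketch of the difference, assuming a small made-up data frame
(the names x, a and b are illustrative only): assigning NULL deletes the
attribute rather than storing a NULL value, leaving a plain named list.

```r
# Illustrative data frame; the column names are arbitrary.
x <- data.frame(a = 1:3, b = c("u", "v", "w"))

# A data frame carries 'names', 'class' and 'row.names' attributes.
before <- names(attributes(x))

# The line in question: the inner assignment runs first and yields NULL,
# which is then assigned to the class as well.
class(x) <- attr(x, "row.names") <- NULL

# Only 'names' is left; the other two attributes are gone, not set to NULL.
after <- names(attributes(x))
print(before)
print(after)
```

What remains is an ordinary list of columns, which is what the rest of
the method then works on.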
> So far I was not able to figure out why this is necessary -
> could anyone help ?
You need to remove the class to avoid recursion: a few lines later x[i]
needs to be a call to the primitive and not the data frame method.
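
The recursion point can be sketched with a toy S3 class. The class name
"demo" and the counter 'calls' are hypothetical and are not taken from
the data frame code; the sketch only shows the dispatch behaviour.

```r
# Counts how often the method below is entered.
calls <- 0L

# Toy '[' method for a hypothetical class "demo".
`[.demo` <- function(x, i) {
  calls <<- calls + 1L
  class(x) <- NULL        # remove the class, so no S3 dispatch below
  y <- x[i]               # this is the primitive '[', not `[.demo`
  class(y) <- "demo"      # restore the class on the result
  y
}

obj <- structure(list(a = 1, b = 2, c = 3), class = "demo")
res <- obj[c("a", "c")]
print(calls)
```

Without the class(x) <- NULL line, the inner x[i] would dispatch to
`[.demo` again and the method would call itself until R stopped it.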
> The reason I am looking at it is that changing attributes forces duplication
> of the data frame and this is the largest cause of slowness of data.frames in
> general.
Do you have evidence of that? R has facilities to profile its code, and I
have never seen [.data.frame taking a significant proportion of the total
time. If it does for your application, consider if a data frame is an
appropriate way to store your data. I am not sure we would accept that
data frames do have 'slowness in general', but their generality does make
them slower than alternatives where the generality is not needed.
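
One way to gather such evidence is R's sampling profiler, Rprof(). The
following is a sketch only, assuming an R build with profiling enabled;
the data frame and the loop are made up for illustration and are not the
poster's application.

```r
# Made-up workload: repeated single-row extraction from a data frame.
df <- data.frame(x = runif(1e4), y = runif(1e4))

prof_file <- tempfile()
Rprof(prof_file)                       # start sampling the call stack
for (k in rep(seq_len(nrow(df)), 3)) {
  tmp <- df[k, ]                       # dispatches to `[.data.frame`
}
Rprof(NULL)                            # stop profiling

# Tabulate the time attributed to each function.
prof <- summaryRprof(prof_file)
print(head(prof$by.total))
```

The by.total table shows what share of the sampled time is spent in
[.data.frame and the functions it calls, which is the number to check
before concluding that the method is the bottleneck.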
[...]
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595