[Rd] Speed difference between df$a[1] and df[1,"a"]
Thomas Lumley
tlumley at uw.edu
Fri Oct 21 07:23:24 CEST 2011
On Wed, Oct 19, 2011 at 2:34 PM, Stavros Macrakis <macrakis at alum.mit.edu> wrote:
> I was surprised to find that df$a[1] is an order of magnitude faster than
> df[1,"a"]:
Yes. df$a[1] treats the data frame as a list, which is much faster.
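A minimal sketch of the comparison, assuming a made-up data frame (the benchmark from the original message is not reproduced here, and the names df, n, t_list and t_index are only illustrative). It shows that the access forms return the same value and gives a rough way to time them; the actual ratio will depend on the R version and the machine.

```r
# Hypothetical example data, not the data from the original post.
df <- data.frame(a = seq(0.5, 500, by = 0.5), b = 1:1000)

# A data frame is a list of equal-length column vectors,
# so $ and [[ ]] are plain list extraction.
is.list(df)
identical(df$a, df[["a"]])

# All three forms return the same element.
identical(df$a[1], df[1, "a"])
identical(df$a[1], df[[1, "a"]])

# Rough timing: list-style access versus matrix-style indexing.
n <- 10000
t_list  <- system.time(for (i in seq_len(n)) df$a[1])[["elapsed"]]
t_index <- system.time(for (i in seq_len(n)) df[1, "a"])[["elapsed"]]
c(list_style = t_list, matrix_style = t_index)
```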
> I thought this might be because df[,] builds a data frame before simplifying
> it to a vector, but with drop=F, it is even slower, so that doesn't seem to
> be the problem:
drop=FALSE creates a data frame first, and then simplifies it to a
vector, so this test isn't showing what you think it is.
> I then wondered if it might be because '[' allows multiple columns and
> handles rownames. Sure enough, '[[,]]', which allows only one column, and
> does not handle rownames, is almost 3x faster:
That's part of it, but if you look at [.data.frame you see there is
also quite a bit of copying that could be avoided in simple cases but
is hard to avoid in full generality.
-thomas
--
Thomas Lumley
Professor of Biostatistics
University of Auckland