[R] speed issues / pros & cons: dataframe vs. matrix

Duncan Murdoch murdoch at stats.uwo.ca
Sat Jun 23 00:59:38 CEST 2007


On 22/06/2007 6:21 PM, Thomas Pujol wrote:
> I've read that certain operations performed on a matrix (e.g. rbind, cbind) are often much faster than the same operations performed on a data frame.
> 
> Other than the "bind functions", what are the main operations that are significantly faster on a matrix?

Indexing (e.g. x[1,3]) is much faster on a matrix.
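
A rough way to check the indexing difference on your own machine is sketched below. The data, the sizes, and the names m, df and n are made up for illustration, and the timings will vary by machine and R version, so none are quoted here.

```r
# Hypothetical example data: 1000 x 10 numeric values.
set.seed(1)
m  <- matrix(rnorm(10000), nrow = 1000, ncol = 10)
df <- as.data.frame(m)

# Both containers return the same element.
same_value <- identical(m[1, 3], df[1, 3])

# Time many repeated single-element lookups on each container.
n <- 10000
t_matrix <- system.time(for (i in seq_len(n)) m[1, 3])["elapsed"]
t_df     <- system.time(for (i in seq_len(n)) df[1, 3])["elapsed"]

print(c(matrix = unname(t_matrix), data.frame = unname(t_df)))
```

The loop repeats the lookup because a single call is too quick for system.time() to measure reliably.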
> 
> I know that data frames allow for column names and row names, and that each column in a data frame can have a different data type.  Are there any other advantages of storing data in a data frame rather than a matrix?

Data frames are lists, so you can use things like df$columnname, 
with(df, expression), attach(df), etc.  Data frame columns have names, 
but matrices don't necessarily.
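
A small sketch of those conveniences, using a made-up data frame (the column names height and weight are only for illustration):

```r
# Hypothetical data frame: rows are cases, columns are characteristics.
df <- data.frame(height = c(1.6, 1.8, 1.7), weight = c(60, 80, 70))

is_list <- is.list(df)               # a data frame is a list of columns
heights <- df$height                 # extract one column by name
bmi     <- with(df, weight / height^2)  # expression evaluated with column names

# A matrix created without dimnames has no column names.
m <- matrix(1:6, nrow = 3)
m_names <- colnames(m)               # NULL here
```

attach(df) is left out of the sketch because it changes the search path for the rest of the session; with() gives similar convenience for a single expression.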

I'd generally use data frames in any situation where the rows are cases 
and the columns are characteristics, until I found they were too slow; 
then I'd consider temporary conversion to a matrix to speed things 
up.  As Knuth said, premature optimization is the root of all evil.
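
One possible shape for such a temporary conversion is sketched below, assuming all columns are numeric (as.matrix() on mixed column types would coerce everything to character). The data and the row-centring loop are invented purely as a stand-in for whatever indexing-heavy step turned out to be slow.

```r
# Hypothetical all-numeric data frame.
set.seed(1)
df <- data.frame(a = rnorm(5), b = rnorm(5), c = rnorm(5))

# Convert once, do the indexing-heavy work on the matrix...
m <- as.matrix(df)
for (i in seq_len(nrow(m))) {
  m[i, ] <- m[i, ] - mean(m[i, ])   # centre each row on its own mean
}

# ...then convert back; column names are carried through.
df2 <- as.data.frame(m)
```

Whether the round trip pays off depends on how much work is done in between, so it is worth timing both versions before committing to it.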

Duncan Murdoch


