[R] speed issues / pros & cons: dataframe vs. matrix
Duncan Murdoch
murdoch at stats.uwo.ca
Sat Jun 23 00:59:38 CEST 2007
On 22/06/2007 6:21 PM, Thomas Pujol wrote:
> I've read that certain operations performed on a matrix (e.g. rbind, cbind) are often much faster than the same operations performed on a data frame.
>
> Other than the "bind functions", what are the main operations that are significantly faster on a matrix?
Indexing (e.g. x[1,3]) is much faster on a matrix.
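
A rough way to check the indexing claim on your own machine is sketched below. The data, the sizes and the names m, df and n are made up for illustration; actual timings depend on the R version and hardware, so none are quoted here.

```r
# Hypothetical data: a 1000 x 10 numeric matrix and the equivalent data frame.
m  <- matrix(rnorm(10000), nrow = 1000, ncol = 10)
df <- as.data.frame(m)

n <- 10000  # number of repeated single-element lookups (arbitrary choice)

# Time n scalar lookups on each structure.
t_matrix <- system.time(for (i in seq_len(n)) m[1, 3])[["elapsed"]]
t_df     <- system.time(for (i in seq_len(n)) df[1, 3])[["elapsed"]]

# Both lookups return the same value; only the cost differs.
same_value <- identical(m[1, 3], df[1, 3])

cat("matrix:", t_matrix, "s   data frame:", t_df, "s\n")
```

The data frame lookup goes through the `[.data.frame` method, which does considerably more work per call than matrix indexing, so the gap tends to show up mainly when single elements are accessed inside a loop.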
>
> I know that data frames allow for column names and row names, and that each column in a data frame can have a different data type. Are there any other advantages of storing data in a data frame rather than a matrix?
Data frames are lists, so you can use things like df$columnname,
with(df, expression), attach(df), etc. Data frame columns have names,
but matrices don't necessarily.
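
A small sketch of what "data frames are lists" buys you in practice. The column names height and weight and the values are invented for the example.

```r
# Hypothetical data frame: rows are cases, columns are characteristics.
df <- data.frame(height = c(150, 160, 170), weight = c(50, 60, 70))

is_list     <- is.list(df)      # a data frame is a list of columns
col_by_name <- df$height        # list-style access with $

# with() evaluates the expression using the columns as variables.
bmi <- with(df, weight / (height / 100)^2)

# The same data as a matrix is not a list; columns are reached
# with [ , "name"] rather than $.
m               <- as.matrix(df)
m_is_list       <- is.list(m)
col_from_matrix <- m[, "height"]

# A matrix built without dimnames has no column names at all.
unnamed  <- matrix(1:6, nrow = 2)
no_names <- is.null(colnames(unnamed))
```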
I'd generally use data frames in any situation where the rows are cases
and the columns are characteristics, until I found they were too slow:
and then I'd consider temporary conversion to a matrix to speed things
up. As Knuth said, premature optimization is the root of all evil.
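
One possible shape for the temporary conversion mentioned above, assuming every column is numeric. The column names a, b and c and the loop body are placeholders, and this particular loop could of course be vectorised; it only stands in for code that really needs element-by-element access.

```r
# Hypothetical all-numeric data frame.
df <- data.frame(a = rnorm(200), b = rnorm(200), c = rnorm(200))

# Convert once, outside the slow part. This is only safe because all
# columns are numeric: as.matrix() on mixed types coerces everything
# to a common type (typically character).
m <- as.matrix(df)

# Element-wise work is done on the matrix.
for (i in seq_len(nrow(m))) {
  m[i, "c"] <- m[i, "a"] + m[i, "b"]
}

# Convert back so the rest of the code keeps working with a data frame.
result <- as.data.frame(m)
```

Timing the code first (for example with system.time or Rprof) is a reasonable way to find out whether the conversion is worth the trouble.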
Duncan Murdoch