[R] when to use & pros/cons of dataframe vs. matrix?
arun
smartpink111 at yahoo.com
Thu Jun 27 20:41:11 CEST 2013
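Hi,
A matrix can hold only one type, so converting a data.frame with mixed column types coerces every column to character: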
set.seed(24)
dat1<-data.frame(X=sample(letters,20,replace=TRUE),Y=sample(1:40,20,replace=TRUE),stringsAsFactors=FALSE)
mat1<-as.matrix(dat1)
sapply(dat1,class)
#          X           Y
#"character"   "integer"
sapply(split(mat1,col(mat1)),class)
#          1           2
#"character" "character"
str(as.data.frame(mat1))
#'data.frame': 20 obs. of 2 variables:
# $ X: Factor w/ 14 levels "b","d","f","g",..: 5 3 11 8 10 14 5 12 13 4 ...
# $ Y: Factor w/ 14 levels "10","13","15",..: 12 5 9 13 14 8 12 6 7 4 ...
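As a rough sketch of what this means in practice (the names mean_df and mean_mat are only for illustration, they are not part of the example above): with the data.frame you can do arithmetic on Y directly, while with the character matrix you have to convert the column back to numeric first.

```r
set.seed(24)
dat1 <- data.frame(X = sample(letters, 20, replace = TRUE),
                   Y = sample(1:40, 20, replace = TRUE),
                   stringsAsFactors = FALSE)
mat1 <- as.matrix(dat1)

# The data.frame keeps Y as integer, so arithmetic works directly.
mean_df <- mean(dat1$Y)

# In the matrix, Y was coerced to character and must be converted back.
mean_mat <- mean(as.numeric(mat1[, "Y"]))

# Both routes should give the same value.
all.equal(mean_df, mean_mat)
```

So if your columns are of different types, a data.frame is usually the more convenient choice.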
If all of your data are of the same type, a matrix is usually faster than a data.frame. For example, rowSums() has to convert a data.frame to a matrix before it can sum the rows:
set.seed(245)
mat2<- matrix(sample(1:50,3*1e7,replace=TRUE),ncol=3)
dat2<- as.data.frame(mat2)
system.time(res1<- rowSums(mat2))
#   user  system elapsed
#  0.132   0.016   0.201
system.time(res2<- rowSums(dat2))
#   user  system elapsed
#  0.376   0.056   0.447
identical(res1,res2)
#[1] TRUE
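If you want to keep a data.frame for convenience but need several row-wise numeric operations, one option is to convert it once and reuse the matrix. This is only a sketch on a smaller data set so that it runs quickly; res_sum and res_mean are names I made up for illustration.

```r
set.seed(245)
# Smaller than the example above so it runs quickly.
dat2 <- as.data.frame(matrix(sample(1:50, 3 * 1e4, replace = TRUE), ncol = 3))

# Convert once, then reuse the matrix for each row-wise operation.
mat2 <- as.matrix(dat2)
res_sum  <- rowSums(mat2)
res_mean <- rowMeans(mat2)

# Same result as calling rowSums() on the data.frame itself.
identical(res_sum, rowSums(dat2))
```

This way the conversion cost is paid once rather than on every call.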
A.K.
----- Original Message -----
From: Anika Masters <anika.masters at gmail.com>
To: R help <r-help at r-project.org>
Cc:
Sent: Thursday, June 27, 2013 2:26 PM
Subject: [R] when to use & pros/cons of dataframe vs. matrix?
When "should" I use a dataframe vs. a matrix? What are the pros and cons?
If I have data of all the same type, am I usually better off using a
matrix and not a dataframe?
What are the advantages if any of using a dataframe vs. a matrix?
(rownames and column names perhaps?)