[R] From data.frame to Matrix
Thomas Lumley
tlumley at u.washington.edu
Mon Oct 29 18:10:57 CET 2001
On Mon, 29 Oct 2001 Pierre.Ilouga at evotecoai.com wrote:
>
> Hi to all.
> I'm a relatively new R user and I would like someone to help me in solving
> the following simple question:
> I have a two column (L , T ) data frame. Column L has three levels: a, b ,
> and c and Column T is real valued.
>
> L T
> a 0.8
> a 0.9
> a 0.1
> a -0.8
> a -1.0
> b -0.3
> b 0.7
> b 1.0
> b 0.4
> c -0.5
> c -1.0
>
> My aim is to get a data frame of the form
>
> L T1 T2 T3 T4 T5
> a 0.8 0.9 0.1 -0.8 -1
> b -0.3 0.7 1.0 0.4 NA
> c -0.5 -1.0 NA NA NA
>
> without using a "for" loop, but by using functions of the "apply" family
> for example, or something specific to R and more efficient.
The speed advantage of the `apply' family over for() loops in the S
dialects is legendary -- partly history and partly myth. The reason to use
apply is usually more for clarity than speed.
If your dataframe is called `data' I would do something like
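# one vector of T values per level of L
rows <- split(data$T, data$L)
# the longest group sets the number of T columns
n <- max(sapply(rows, length))
# start from an all-NA matrix so short groups stay padded with NA
Ti <- matrix(NA, ncol = n, nrow = length(rows))
for (i in seq_along(rows))
    Ti[i, seq_along(rows[[i]])] <- rows[[i]]
answer <- data.frame(names(rows), Ti)
names(answer) <- c("L", paste("T", 1:n, sep = ""))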
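As a rough check, the steps above can be run on the eleven rows from the question. This is only a sketch: the data frame is retyped here by hand, and the name `data` and the column names `L` and `T` are taken from the messages above.

```r
# Example data copied from the question; `data` is the name assumed in the reply.
data <- data.frame(
  L = c(rep("a", 5), rep("b", 4), rep("c", 2)),
  T = c(0.8, 0.9, 0.1, -0.8, -1.0, -0.3, 0.7, 1.0, 0.4, -0.5, -1.0)
)

# One numeric vector per level of L.
rows <- split(data$T, data$L)

# Width of the result is the size of the largest group.
n <- max(sapply(rows, length))

# Fill an all-NA matrix row by row; unfilled cells remain NA.
Ti <- matrix(NA, ncol = n, nrow = length(rows))
for (i in seq_along(rows)) {
  Ti[i, seq_along(rows[[i]])] <- rows[[i]]
}

answer <- data.frame(names(rows), Ti)
names(answer) <- c("L", paste("T", 1:n, sep = ""))
print(answer)
```

The result has one row per level of L and columns L, T1, ..., T5, with NA wherever a level has fewer than five values.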
You could replace the loop with something like
Ti<-do.call("rbind",lapply(rows,function(r) c(r,rep(NA,n-length(r)))))
but in this case it would be less clear than the loop and is unlikely to
be faster.
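For anyone who wants to compare the two forms, here is a minimal sketch that builds the matrix both ways on the data from the question and checks that they agree. The helper name `pad` is made up for this example, and the last variant (indexing past the end of a vector to get NA) is an extra alternative that is not part of the suggestion above.

```r
# Example data copied from the question; `data` is the name assumed in the reply.
data <- data.frame(
  L = c(rep("a", 5), rep("b", 4), rep("c", 2)),
  T = c(0.8, 0.9, 0.1, -0.8, -1.0, -0.3, 0.7, 1.0, 0.4, -0.5, -1.0)
)
rows <- split(data$T, data$L)
n <- max(sapply(rows, length))

# Loop version: fill an all-NA matrix row by row.
Ti_loop <- matrix(NA, ncol = n, nrow = length(rows))
for (i in seq_along(rows)) {
  Ti_loop[i, seq_along(rows[[i]])] <- rows[[i]]
}

# do.call version: pad each group with NA to length n, then stack the rows.
# `pad` is a hypothetical helper name used only in this sketch.
pad <- function(r) c(r, rep(NA, n - length(r)))
Ti_bind <- do.call("rbind", lapply(rows, pad))

# Alternative (an assumption, not from the reply): indexing beyond the end
# of a vector yields NA, so sapply() plus t() gives the same layout.
Ti_index <- t(sapply(rows, `[`, seq_len(n)))

# rbind() and sapply() carry the group names along as row names,
# so drop them before comparing with the loop result.
same_bind <- isTRUE(all.equal(unname(Ti_bind), Ti_loop))
same_index <- isTRUE(all.equal(unname(Ti_index), Ti_loop))
print(c(same_bind, same_index))
```

Timing the versions with system.time() on a realistically sized data frame would be the way to settle the speed question for a particular case.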
-thomas
Thomas Lumley Asst. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle