Thomas Lumley tlumley at u.washington.edu
Mon Oct 29 18:10:57 CET 2001

```
On Mon, 29 Oct 2001 Pierre.Ilouga at evotecoai.com wrote:

>
> Hi to all.
> I'm a relatively new R user and I would like someone to help me in solving
> the following simple question:
> I have a two column (L , T ) data frame. Column L has three levels: a, b ,
> and c  and Column T is real valued.
>
>   L     T
>   a   0.8
>   a   0.9
>   a   0.1
>   a  -0.8
>   a  -1.0
>   b  -0.3
>   b   0.7
>   b   1.0
>   b   0.4
>   c  -0.5
>   c  -1.0
>
> My aim is to get a data frame of the form
>
>   L    T1    T2    T3    T4    T5
>   a   0.8   0.9   0.1  -0.8    -1
>   b  -0.3   0.7   1.0   0.4    NA
>   c  -0.5  -1.0    NA    NA    NA
>
> without using a "for" loop, but by using functions of the "apply" family
> for example or something specific to R and more efficient.

The speed advantage of the `apply' family over for() loops in the S
dialects is legendary -- partly history and partly myth. The reason to use
apply is usually more for clarity than speed.

If your dataframe is called `data' I would do something like
rows<-split(data$T,data$L)
n<-max(sapply(rows,length))

Ti<-matrix(NA, ncol=n, nrow=length(rows))
for(i in 1:length(rows))
  Ti[i,1:length(rows[[i]])]<-rows[[i]]

You could replace the loop with something like
Ti<-do.call("rbind",lapply(rows,function(r) c(r,rep(NA,n-length(r)))))
but in this case it would be less clear than the loop and is unlikely to
be faster.
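
# A minimal, self-contained sketch of the two approaches above, run on the
# data from the question. The data frame is rebuilt by hand here, and the
# names `Ti2` and `wide` are illustrative only (not from the original post).
data <- data.frame(
  L = factor(c("a", "a", "a", "a", "a", "b", "b", "b", "b", "c", "c")),
  T = c(0.8, 0.9, 0.1, -0.8, -1.0, -0.3, 0.7, 1.0, 0.4, -0.5, -1.0)
)

# One numeric vector per level of L, and the length of the longest one.
rows <- split(data$T, data$L)
n <- max(sapply(rows, length))

# Loop version: fill each row of an NA matrix with that group's values.
Ti <- matrix(NA, ncol = n, nrow = length(rows))
for (i in seq_along(rows))
  Ti[i, seq_along(rows[[i]])] <- rows[[i]]

# lapply version: pad each group with NA to length n, then stack the rows.
Ti2 <- do.call("rbind", lapply(rows, function(r) c(r, rep(NA, n - length(r)))))

# Assumed final step: attach the level names and T1..Tn column names to get
# a data frame shaped like the one asked for.
colnames(Ti) <- paste0("T", seq_len(n))
wide <- data.frame(L = names(rows), Ti)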

-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

