[R] More list to vector puzzle

Blanchette, Marco MAB at stowers-institute.org
Thu Nov 20 04:15:48 CET 2008


Many thanks for the answers on my previous question, it got me started.
Indeed, stack() was the function I was vaguely remembering.

However, I didn't get very far because my data set is way more complicated
than I expected. In fact, I have a mixture of levels and lists within a list.
Basically, it resembles the following list (named data), made of the levels H
and the lists A and T. For each level of H, the T[x]s are the same but
the A[x]s are different.

> H <- c(rep('H1',3),rep('H2',3),rep('H3',3))
> A <- list(A1=round(runif(3,100,1000)),
+  A2=round(runif(3,100,1000)),
+  A3=round(runif(3,100,1000)),
+  A4=round(runif(3,100,1000)),
+  A5=round(runif(3,100,1000)),
+  A6=round(runif(3,100,1000)),
+  A7=round(runif(3,100,1000)),
+  A8=round(runif(3,100,1000)),
+  A9=round(runif(3,100,1000))
+  )
> T1 <- round(runif(7,1,10))
> T2 <- round(runif(5,1,10))
> T3 <- round(runif(6,1,10))
> T <- list(T1,T1,T1,T2,T2,T2,T3,T3,T3)
> data <- list(H=H,A=A,T=T)

Basically, it can be represented as the following data structure:
H     A               T
H1    458 255 160     4  8  10 8  9  9  3
H1    343 424 298     4  8  10 8  9  9  3
H1    608 831 544     4  8  10 8  9  9  3

H2    616 266 413     7  3  5  4  5
H2    687 796 752     7  3  5  4  5
H2    814 921 228     7  3  5  4  5

H3    789 558 400     8  3  3  7  6  5
H3    845 298 855     8  3  3  7  6  5
H3    725 366 621     8  3  3  7  6  5

My goal is to get, for each level of H, a data frame holding the values of the
As with an index telling which A each value comes from, plus a single copy of
the Ts with a corresponding label, and to do so for every H. I then want to
fit a linear model of value~ind for each H (of course, the data are fake
here), followed by an anova for each H. Thus, for each level of H I need
something similar to:

$H1
value ind
458 A1
255 A1
160 A1
343 A2
424 A2
298 A2
608 A3
831 A3
544 A3
4   T
8   T
10  T
8   T
9   T
9   T
3   T
...
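
A minimal sketch of one way the layout above could be built, assuming the
same fake data as in the setup code. The seed and the names Tlist, idx,
one_level and long are made up for illustration and are not part of the
original session.

```r
set.seed(42)  # assumption: fixed seed only so the fake data are reproducible

# Rebuild the fake data from the top of the post
H <- rep(c("H1", "H2", "H3"), each = 3)
A <- lapply(1:9, function(i) round(runif(3, 100, 1000)))
names(A) <- paste0("A", 1:9)
Tlist <- list(round(runif(7, 1, 10)),
              round(runif(5, 1, 10)),
              round(runif(6, 1, 10)))
data <- list(H = H, A = A, T = Tlist[rep(1:3, each = 3)])

# Positions in A and T that belong to each level of H
idx <- split(seq_along(data$H), data$H)

# Hypothetical helper: one long data frame for the positions in i
one_level <- function(i) {
  a <- stack(data$A[i])              # columns: values, ind (A1, A2, ...)
  a$ind <- as.character(a$ind)       # plain strings so "T" can be appended
  tvals <- data.frame(values = data$T[[i[1]]],  # the Ts are identical within H
                      ind = "T", stringsAsFactors = FALSE)
  out <- rbind(a, tvals)
  out$ind <- factor(out$ind)
  out
}

long <- lapply(idx, one_level)
long$H1
```

Because the Ts are repeated within each H, the sketch keeps only the first
copy (i[1]) and labels all of its values "T".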

As you might have guessed, we have several tens of thousands of Hs, so I
cannot just do it manually one at a time. I tried breaking the problem down
into small pieces but did not get very far.
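
To show what "not one at a time" could look like once there is one long data
frame per H, here is a rough sketch that fits the model and runs the anova
over a whole list with lapply. The helper make_level, the seed and the two
fake levels are assumptions made up for this example.

```r
set.seed(1)  # assumption: arbitrary seed, the values are fake

# Hypothetical helper: a long data frame shaped like the $H1 example,
# with three values per A label followed by n_t values labelled "T"
make_level <- function(labels, n_t) {
  data.frame(
    values = c(round(runif(3 * length(labels), 100, 1000)),
               round(runif(n_t, 1, 10))),
    ind = factor(c(rep(labels, each = 3), rep("T", n_t)))
  )
}

long <- list(H1 = make_level(c("A1", "A2", "A3"), 7),
             H2 = make_level(c("A4", "A5", "A6"), 5))

# One linear model and one anova table per level of H
fits   <- lapply(long, function(d) lm(values ~ ind, data = d))
tables <- lapply(fits, anova)

# p-value of the ind term for each H
pvals <- sapply(tables, function(tab) tab[["Pr(>F)"]][1])
pvals
```

With tens of thousands of Hs it may be preferable to keep only the summary
numbers (as in pvals) rather than every fitted model.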

I was very excited when I got the following call to produce the expected
result:

> a <- tapply(data$A,data$H,function(x) stack(x))
> t <- tapply(data$T,data$H,function(x) x[1])
> tt <- lapply(t,function(x) data.frame(values=unlist(x),
+ ind=rep(1:length(x),sapply(x,length))))
> a
$H1
  values ind
1    458  A1
2    255  A1
3    160  A1
4    343  A2
5    424  A2
6    298  A2
7    608  A3
8    831  A3
9    544  A3
...

> tt
$H1
  values ind
1      4   1
2      8   2
3     10   3
4      8   4
5      9   5
6      9   6
7      3   7
...

However, I tried to rbind the data frames in a and tt (one per H level)
using lapply or sapply, without any success.
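
A possible sketch of the element-wise rbind, assuming both lists are named by
H level. It uses Map(), which walks two lists in parallel, whereas lapply and
sapply iterate over a single list. The small a and tt below are cut-down
stand-ins for the real ones, and relabelling tt's ind as "T" is an assumption
made so that the two ind columns are compatible.

```r
# Cut-down stand-ins for a and tt (first values from the table above)
a <- list(
  H1 = data.frame(values = c(458, 255, 160), ind = c("A1", "A1", "A1")),
  H2 = data.frame(values = c(616, 266, 413), ind = c("A4", "A4", "A4"))
)
tt <- list(
  H1 = data.frame(values = c(4, 8, 10), ind = "T"),
  H2 = data.frame(values = c(7, 3, 5), ind = "T")
)

# Pair the data frames by H level and stack each pair;
# tt[names(a)] keeps the two lists in the same order
combined <- Map(rbind, a, tt[names(a)])
combined$H1
```

The result is again a list with one data frame per H, so it can be passed
straight to lapply for any per-level analysis.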

I am in need of some guru advice on this one...

Also, I am not sure this is the most elegant way to produce the data
structure I am trying to build. Any advice?

Thanks

--
Marco Blanchette, Ph.D.
Assistant Investigator
Stowers Institute for Medical Research
1000 East 50th St.

Kansas City, MO 64110

Tel: 816-926-4071
Cell: 816-726-8419
Fax: 816-926-2018


