[R] how to "singlify" entries
Charles Plessy
charles-r-nospam at plessy.org
Mon May 30 16:54:59 CEST 2005
On Mon, May 30, 2005 at 09:09:27AM -0400, Gabor Grothendieck wrote :
> Try using reshape, e.g. if dd is your data frame:
>
> reshape(dd, dir = "wide", idvar = "F1", timevar = "F2",
> varying = list(c("VX","VY")))
Thank you very much, and to Petr Pikal too. Reshape is exactly what I had forgotten.
Now the bad news is that I had simplified my example; my real situation is
slightly more complex.
I have three factors and one value:
> count_per_tc[1:10,]
   rna   lib           tc x
1  CAB 114BA T01F00380F47 1
2  CAE 114BB T01F00381273 1
3  CAJ 114BA T01F0048F6D1 1
4  CAB 114BC T01F0048F6D1 1
5  CAB 114BA T01F00498689 2
6  CAC 114BA T01F00498689 1
7  CAE 114BA T01F00498689 2
8  CAG 114BA T01F00498689 2
9  CAH 114BA T01F00498689 1
10 CAI 114BA T01F00498689 2
I would like a data frame that gives, for each "tc", the value of x for each
combination of "rna" and "lib":
> reshape(count_per_tc[1:10,], direction="wide", timevar="tc", idvar=c("rna","lib"))
   rna   lib x.T01F00380F47 x.T01F00381273 x.T01F0048F6D1 x.T01F00498689
1  CAB 114BA              1             NA             NA              2
2  CAE 114BB             NA              1             NA             NA
3  CAJ 114BA             NA             NA              1             NA
4  CAB 114BC             NA             NA              1             NA
6  CAC 114BA             NA             NA             NA              1
7  CAE 114BA             NA             NA             NA              2
8  CAG 114BA             NA             NA             NA              2
9  CAH 114BA             NA             NA             NA              1
10 CAI 114BA             NA             NA             NA              2
Oops, I need it the other way round:
> t(reshape(count_per_tc[1:10,], direction="wide", timevar="tc", idvar=c("rna","lib")))
               1       2       3       4       6       7       8       9       10
rna            "CAB"   "CAE"   "CAJ"   "CAB"   "CAC"   "CAE"   "CAG"   "CAH"   "CAI"
lib            "114BA" "114BB" "114BA" "114BC" "114BA" "114BA" "114BA" "114BA" "114BA"
x.T01F00380F47 " 1"    NA      NA      NA      NA      NA      NA      NA      NA
x.T01F00381273 NA      " 1"    NA      NA      NA      NA      NA      NA      NA
x.T01F0048F6D1 NA      NA      " 1"    " 1"    NA      NA      NA      NA      NA
x.T01F00498689 " 2"    NA      NA      NA      " 1"    " 2"    " 2"    " 1"    " 2"
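The quotes in the output above suggest that t() has turned everything, counts
included, into character strings. A minimal sketch of that behaviour, using a
small hypothetical data frame `df` (not the real data), might look like this:

```r
# Hypothetical two-row data frame mixing a text column and a numeric column.
df <- data.frame(rna = c("CAB", "CAE"), x = c(1, 2), stringsAsFactors = FALSE)

# t() on a data frame goes through as.matrix(); with mixed column types
# the result is a character matrix, so the counts become strings.
m <- t(df)
print(is.character(m))

# Transposing only the numeric column keeps the values numeric.
m_num <- t(df[, "x", drop = FALSE])
print(is.numeric(m_num))
```

So arithmetic or comparisons such as `> 0` on the transposed table would need
the values converted back to numbers first.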
The ultimate goal is (after proper renaming of the columns) to do things like
plot(CAA-114BA[CAA-114BA >0 & CAA-114BB > 0], CAA-114BB[CAA-114BA >0 & CAA-114BB > 0])
(This combination will appear once I reshape the whole data frame, which has
200,000 rows.) After that come proper statistical tests, which I still have to
learn, or remember from 12 years ago.
Once again, thank you, and please warn me if I am doing something stupid with
this transposition of the reshaped table.
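One possible way to avoid the transposition altogether, sketched here on the
ten rows shown earlier: paste "rna" and "lib" into a single label and use that
as timevar, with "tc" as idvar. The column name `sample`, the "_" separator
and the object names `wide` and `keep` are my own assumptions for illustration.

```r
# The ten example rows from above, re-entered by hand.
count_per_tc <- data.frame(
  rna = c("CAB", "CAE", "CAJ", "CAB", "CAB", "CAC", "CAE", "CAG", "CAH", "CAI"),
  lib = c("114BA", "114BB", "114BA", "114BC", "114BA",
          "114BA", "114BA", "114BA", "114BA", "114BA"),
  tc  = c("T01F00380F47", "T01F00381273", "T01F0048F6D1", "T01F0048F6D1",
          rep("T01F00498689", 6)),
  x   = c(1, 1, 1, 1, 2, 1, 2, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Combine the two factors into one label such as "CAB_114BA" (assumed naming).
count_per_tc$sample <- paste(count_per_tc$rna, count_per_tc$lib, sep = "_")

# One row per tc, one numeric column per rna/lib combination.
wide <- reshape(count_per_tc[, c("tc", "sample", "x")],
                direction = "wide", idvar = "tc", timevar = "sample")

# Drop the "x." prefix that reshape() adds to the new column names.
names(wide) <- sub("^x\\.", "", names(wide))

# Missing combinations are NA, not 0, so exclude them explicitly.
keep <- !is.na(wide$CAB_114BA) & !is.na(wide$CAE_114BA) &
  wide$CAB_114BA > 0 & wide$CAE_114BA > 0
print(wide[keep, c("tc", "CAB_114BA", "CAE_114BA")])
```

With the full table, something like plot(wide$CAB_114BA[keep], wide$CAE_114BA[keep])
would then work on numeric columns directly, and the names are valid R names
because they start with a letter and contain no hyphen.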
Best regards,
--
Charles