[R] how to create a txt file with parsed columns
Jim Lemon
drj|m|emon @end|ng |rom gm@||@com
Mon Dec 9 06:03:45 CET 2019
Hi Ana,
Is this what you want?
a<-read.table(text="GENE rs BETA
1 ENSG00000154803 rs2605134 0.0360182
2 ENSG00000154803 rs7405677 0.0525463
3 ENSG00000154803 rs7211573 0.0525531
4 ENSG00000154803 rs2746026 0.0466392
5 ENSG00000141030 rs2605134 0.0806140
6 ENSG00000141030 rs7405677 0.0251654
7 ENSG00000141030 rs7211573 0.0252775
8 ENSG00000141030 rs2746026 0.0976396
9 ENSG00000205309 rs2605134 0.0838975
10 ENSG00000205309 rs7405677 -0.2148500
11 ENSG00000205309 rs7211573 -0.2148170
12 ENSG00000205309 rs2746026 0.1013920
13 ENSG00000215030 rs2605134 0.1261050
14 ENSG00000215030 rs7405677 0.0165236
15 ENSG00000215030 rs7211573 0.0163509
16 ENSG00000215030 rs2746026 0.1201180
17 ENSG00000141026 rs2605134 0.0485897
18 ENSG00000141026 rs7405677 -0.0929964
19 ENSG00000141026 rs7211573 -0.0930321
20 ENSG00000141026 rs2746026 0.0623033",
header=TRUE,stringsAsFactors=FALSE)
b<-read.table(text="rs GWAS
1 rs2605134 0.0315177
2 rs7405677 -0.0816389
3 rs7211573 -0.0797796
4 rs2746026 0.0199350
5 rs11658521 0.0728377
6 rs9914107 0.0720096
7 rs56964223 0.0723903",
header=TRUE,stringsAsFactors=FALSE)
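# inner join on rs: only rs values present in both a and b are kept
ab<-merge(a,b,by="rs")
library(prettyR)
# one row per rs, with the repeated GENE and BETA values spread across columns
abc<-stretch_df(ab,idvar="rs",to.stretch=c("GENE","BETA"))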
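
If the file needs exactly the layout described in the question (one column named after each ENSG, all rs from b kept, zeros where there is no match), a base R variant might look like the sketch below. It is only an illustration under some assumptions: it uses a cut-down subset of the posted data, relies on reshape() and merge(..., all.y=TRUE) instead of prettyR, and the output file name rs_by_gene.txt is made up for the example.

```r
# Cut-down subset of the data from the thread, enough to show the reshaping.
a <- read.table(text = "GENE rs BETA
ENSG00000154803 rs2605134 0.0360182
ENSG00000154803 rs7405677 0.0525463
ENSG00000141030 rs2605134 0.0806140
ENSG00000141030 rs7405677 0.0251654",
  header = TRUE, stringsAsFactors = FALSE)
b <- read.table(text = "rs GWAS
rs2605134 0.0315177
rs7405677 -0.0816389
rs11658521 0.0728377",
  header = TRUE, stringsAsFactors = FALSE)

# Long to wide: one row per rs, one BETA column per GENE.
wide <- reshape(a, idvar = "rs", timevar = "GENE", direction = "wide")
# reshape() names the new columns "BETA.<GENE>"; keep only the gene ID.
names(wide) <- sub("^BETA\\.", "", names(wide))

# all.y = TRUE keeps every rs in b, including those that never occur in a.
out <- merge(wide, b, by = "rs", all.y = TRUE)
# Gene/rs combinations with no BETA come back as NA; the question asks for 0.
out[is.na(out)] <- 0
# merge() sorts by rs; restore the row order of b.
out <- out[match(b$rs, out$rs), ]
rownames(out) <- NULL

# Hypothetical output file name, tab-separated.
write.table(out, file = "rs_by_gene.txt", sep = "\t",
            quote = FALSE, row.names = FALSE)
```

On the full data this should give the 7 rows and 53 columns asked for, provided each GENE/rs pair occurs at most once in a.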
Jim
On Mon, Dec 9, 2019 at 11:10 AM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>
> Hello,
>
> I have two data frames:
>
> head(a)
> GENE rs BETA
> 1 ENSG00000154803 rs2605134 0.0360182
> 2 ENSG00000154803 rs7405677 0.0525463
> 3 ENSG00000154803 rs7211573 0.0525531
> 4 ENSG00000154803 rs2746026 0.0466392
> 5 ENSG00000141030 rs2605134 0.0806140
> 6 ENSG00000141030 rs7405677 0.0251654
> 7 ENSG00000141030 rs7211573 0.0252775
> 8 ENSG00000141030 rs2746026 0.0976396
> 9 ENSG00000205309 rs2605134 0.0838975
> 10 ENSG00000205309 rs7405677 -0.2148500
> 11 ENSG00000205309 rs7211573 -0.2148170
> 12 ENSG00000205309 rs2746026 0.1013920
> 13 ENSG00000215030 rs2605134 0.1261050
> 14 ENSG00000215030 rs7405677 0.0165236
> 15 ENSG00000215030 rs7211573 0.0163509
> 16 ENSG00000215030 rs2746026 0.1201180
> 17 ENSG00000141026 rs2605134 0.0485897
> 18 ENSG00000141026 rs7405677 -0.0929964
> 19 ENSG00000141026 rs7211573 -0.0930321
> 20 ENSG00000141026 rs2746026 0.0623033
>
> head(b)
> rs GWAS
> 1 rs2605134 0.0315177
> 2 rs7405677 -0.0816389
> 3 rs7211573 -0.0797796
> 4 rs2746026 0.0199350
> 5 rs11658521 0.0728377
> 6 rs9914107 0.0720096
> 7 rs56964223 0.0723903
>
> Data frame a has:
> > length(unique(a$GENE))
> [1] 51
> > dim(a)
> [1] 287 3
>
> and the whole data frame b is shown above.
>
> I would like to create a txt file with one row for each rs in data
> frame b and one column for each ENSG in data frame a, holding the
> matching BETA. If a particular ENSG has no matching rs from data frame
> b, the value under it would be zero. So the txt file would have 7 rows
> (one for each unique rs in data frame b) and 53 columns (51 for the
> ENSGs, one for the rs and one for GWAS).
>
> So one row of that txt file would look like this.
>
> GENES ENSG00000154803 ENSG00000141030 ENSG00000205309 ENSG00000215030 ENSG00000141026 GWAS
> rs2605134 0.0360182 0.0806140 0.0838975 0.1261050 0.0485897 0.0315177
> …
>
> Please advise,
> Ana