[R] how to create a txt file with parsed columns

Mon Dec 9 06:03:45 CET 2019

Hi Ana,
Is this what you want?

a<-read.table(text="GENE        rs       BETA
1  ENSG00000154803 rs2605134  0.0360182
2  ENSG00000154803 rs7405677  0.0525463
3  ENSG00000154803 rs7211573  0.0525531
4  ENSG00000154803 rs2746026  0.0466392
5  ENSG00000141030 rs2605134  0.0806140
6  ENSG00000141030 rs7405677  0.0251654
7  ENSG00000141030 rs7211573  0.0252775
8  ENSG00000141030 rs2746026  0.0976396
9  ENSG00000205309 rs2605134  0.0838975
10 ENSG00000205309 rs7405677 -0.2148500
11 ENSG00000205309 rs7211573 -0.2148170
12 ENSG00000205309 rs2746026  0.1013920
13 ENSG00000215030 rs2605134  0.1261050
14 ENSG00000215030 rs7405677  0.0165236
15 ENSG00000215030 rs7211573  0.0163509
16 ENSG00000215030 rs2746026  0.1201180
17 ENSG00000141026 rs2605134  0.0485897
18 ENSG00000141026 rs7405677 -0.0929964
19 ENSG00000141026 rs7211573 -0.0930321
20 ENSG00000141026 rs2746026  0.0623033",
header=TRUE,stringsAsFactors=FALSE)
b<-read.table(text="rs       GWAS
1  rs2605134  0.0315177
2  rs7405677 -0.0816389
3  rs7211573 -0.0797796
4  rs2746026  0.0199350
5 rs11658521  0.0728377
6  rs9914107  0.0720096
7 rs56964223  0.0723903",
header=TRUE,stringsAsFactors=FALSE)
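# inner join on rs: only rs IDs present in both a and b are kept
ab<-merge(a,b,by="rs")
library(prettyR)
# reshape from long to wide: one row per rs, GENE and BETA spread across columns
abc<-stretch_df(ab,idvar="rs",to.stretch=c("GENE","BETA"))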
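
If you need exactly the layout described in your message (one row per rs in b, one column per gene, zero where a gene has no value for that rs), a base R sketch along these lines may get closer. merge() without all.y=TRUE keeps only the rs IDs found in both data frames, so this starts from b instead. It uses a shortened copy of the data so that it runs on its own. The output file name genes_by_rs.txt is made up for illustration, and the code assumes each GENE/rs pair occurs at most once in a.

```r
# Shortened copies of a and b, just enough to run the example
a <- read.table(text="GENE rs BETA
ENSG00000154803 rs2605134 0.0360182
ENSG00000154803 rs7405677 0.0525463
ENSG00000141030 rs2605134 0.0806140
ENSG00000141030 rs7405677 0.0251654",
header=TRUE, stringsAsFactors=FALSE)
b <- read.table(text="rs GWAS
rs2605134 0.0315177
rs7405677 -0.0816389
rs11658521 0.0728377",
header=TRUE, stringsAsFactors=FALSE)

genes <- unique(a$GENE)

# One row per rs in b, one column per gene, pre-filled with zeros
wide <- matrix(0, nrow=nrow(b), ncol=length(genes),
               dimnames=list(b$rs, genes))

# Only rows of a whose rs also occurs in b can be placed in the matrix
keep <- a$rs %in% b$rs

# Fill by (row name, column name) pairs; assumes no duplicated GENE/rs pairs
wide[cbind(a$rs[keep], a$GENE[keep])] <- a$BETA[keep]

# check.names=FALSE leaves the gene IDs unchanged as column names
out <- data.frame(GENES=b$rs, wide, GWAS=b$GWAS,
                  check.names=FALSE, stringsAsFactors=FALSE)

# "genes_by_rs.txt" is a hypothetical file name
write.table(out, file="genes_by_rs.txt", sep="\t",
            quote=FALSE, row.names=FALSE)
```

With the full data this should give 7 rows and 53 columns (the rs IDs, 51 genes and GWAS). An rs such as rs11658521 that never appears in a gets a row of zeros followed by its GWAS value.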

Jim

On Mon, Dec 9, 2019 at 11:10 AM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>
> Hello,
>
> I have two data frames:
>
> head(a)
>               GENE        rs       BETA
> 1  ENSG00000154803 rs2605134  0.0360182
> 2  ENSG00000154803 rs7405677  0.0525463
> 3  ENSG00000154803 rs7211573  0.0525531
> 4  ENSG00000154803 rs2746026  0.0466392
> 5  ENSG00000141030 rs2605134  0.0806140
> 6  ENSG00000141030 rs7405677  0.0251654
> 7  ENSG00000141030 rs7211573  0.0252775
> 8  ENSG00000141030 rs2746026  0.0976396
> 9  ENSG00000205309 rs2605134  0.0838975
> 10 ENSG00000205309 rs7405677 -0.2148500
> 11 ENSG00000205309 rs7211573 -0.2148170
> 12 ENSG00000205309 rs2746026  0.1013920
> 13 ENSG00000215030 rs2605134  0.1261050
> 14 ENSG00000215030 rs7405677  0.0165236
> 15 ENSG00000215030 rs7211573  0.0163509
> 16 ENSG00000215030 rs2746026  0.1201180
> 17 ENSG00000141026 rs2605134  0.0485897
> 18 ENSG00000141026 rs7405677 -0.0929964
> 19 ENSG00000141026 rs7211573 -0.0930321
> 20 ENSG00000141026 rs2746026  0.0623033
>
> head(b)
>           rs       GWAS
> 1  rs2605134  0.0315177
> 2  rs7405677 -0.0816389
> 3  rs7211573 -0.0797796
> 4  rs2746026  0.0199350
> 5 rs11658521  0.0728377
> 6  rs9914107  0.0720096
> 7 rs56964223  0.0723903
>
> Data frame a has:
> > length(unique(a$GENE))
> [1] 51
> > dim(a)
> [1] 287   3
>
> and the whole data frame b is shown
>
> I would like to create a txt file with one row per rs in data frame b
> and one column per ENSG, holding the matching BETA value. If a particular
> ENSG has no value for an rs from data frame b, the entry would be zero.
> So the txt file would have 7 rows (one for each unique rs in data frame b)
> and 53 columns (51 for the ENSGs, one for the rs IDs and one for GWAS).
>
> So one row of that txt file would look like this.
>
> GENES      ENSG00000154803  ENSG00000141030  ENSG00000205309  ENSG00000215030  ENSG00000141026  GWAS
> rs2605134  0.0360182        0.0806140        0.0838975        0.1261050        0.0485897        0.0315177
> …
>
> Please advise,
> Ana