[R] how to create a plot of permutation of 30 random values and show proportion of values
Ana Marija
sokovic.anamarija using gmail.com
Fri Jan 24 05:05:12 CET 2020
Hi Jim,
thanks for getting back to me.
Can you please confirm whether you can see the attached plot?
Thanks
Ana
On Thu, Jan 23, 2020 at 8:06 PM Jim Lemon <drjimlemon using gmail.com> wrote:
>
> Hi Ana,
> You seem to be working on an identification or classification problem.
> Your sample plot didn't come through, perhaps try converting it to a
> PDF or PNG.
> I may be missing something, but I can't see how randomly selecting 30
> values from almost 4 million is going to mean anything in terms of
> statistical significance. I hope you will pardon me for saying that it
> looks like a "p-trawl". It is easy to select cases where the p-value
> is less than 0.05:
>
> a[a$pvalue < 0.05,]
>
> Maybe what you want to do is display this subset of your data as
> candidates for a match among the very large number of non-matches.
> Let's do a bit of damage to your sample data and add the proportions:
>
> a<-read.table(text="rs pvalue pSNP
> rs185642176 0.0267407 0.6
> rs184120752 0.0787681 0.3
> rs10904045 0.0508162 0.4
> rs35849539 0.0875910 0.2
> rs141633513 0.0787759 0.2
> rs4468273 0.0542171 0.4
> rs4567378 0.0539484 0.4
> rs7084251 0.0126445 0.7
> rs181605000 0.0787838 0.35
> rs12255619 0.0192719 0.61
> rs140367257 0.0788008 0.25
> rs10904178 0.0969814 0.16
> rs7918960 0.0436341 0.45
> rs61688896 0.0526256 0.39
> rs151283848 0.0787284 0.34
> rs140174295 0.0989107 0.11
> rs145945079 0.0787015 0.23
> rs4881370 0.0455089 0.51
> rs183895035 0.0787015 0.22
> rs181749526 0.0787015 0.22",
> header=TRUE,stringsAsFactors=FALSE)
> # keep only the SNPs with a p-value below 0.05
> alt05<-a[a$pvalue < 0.05,]
> library(plotrix)
> n05<-nrow(alt05)
> # rows: proportion, lower and upper limits (an arbitrary +/- 0.1), N
> segmat<-matrix(c(alt05$pSNP,alt05$pSNP-0.1,alt05$pSNP+0.1,rep(1,n05)),
> nrow=4,byrow=TRUE)
> rownames(segmat)<-c("prop","lower","upper","N")
> centipede.plot(segmat,mar=c(4,6,3,4),
> main="Proportion of SNPs",
> left.labels=alt05$rs,right.labels=rep("",n05))
>
> This is probably not what you want, but it is a start.
>
> Jim
>
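A small sketch of the point about selecting on p < 0.05, using simulated data rather than the real set. It assumes p-values that are uniform on [0, 1] (no real signal at all); the names `p_null`, `raw_hits` and `bh_hits` are made up for illustration. Roughly 5% of such values pass the raw filter by chance, while a BH adjustment over the full set can never keep more of them than the raw filter does.

```r
set.seed(42)  # reproducible illustration

# Simulated p-values with no true signal (an assumption, not the real data)
p_null <- runif(100000)

# Fraction passing the raw filter; close to 0.05 by construction
raw_hits <- mean(p_null < 0.05)

# Number still below 0.05 after a BH adjustment over the whole set
bh_hits <- sum(p.adjust(p_null, method = "BH") < 0.05)

raw_hits
bh_hits
```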
> On Fri, Jan 24, 2020 at 7:08 AM Ana Marija <sokovic.anamarija using gmail.com> wrote:
> >
> > Hello,
> >
> > I have a data frame which looks like this:
> >
> > > head(a,20)
> > rs pvalue
> > 1: rs185642176 0.267407
> > 2: rs184120752 0.787681
> > 3: rs10904045 0.508162
> > 4: rs35849539 0.875910
> > 5: rs141633513 0.787759
> > 6: rs4468273 0.542171
> > 7: rs4567378 0.539484
> > 8: rs7084251 0.126445
> > 9: rs181605000 0.787838
> > 10: rs12255619 0.192719
> > 11: rs140367257 0.788008
> > 12: rs10904178 0.969814
> > 13: rs7918960 0.436341
> > 14: rs61688896 0.526256
> > 15: rs151283848 0.787284
> > 16: rs140174295 0.989107
> > 17: rs145945079 0.787015
> > 18: rs4881370 0.455089
> > 19: rs183895035 0.787015
> > 20: rs181749526 0.787015
> > > dim(a)
> > [1] 3859763 2
> >
> > What I would like to do is take random subsets of 30 of those rs IDs
> > from throughout the data frame and find out which of the generated
> > subsets have an FDR value < 0.05.
> >
> > I guess I would calculate the FDR with:
> > a$fdr=p.adjust(a$pvalue,method="BH")
> >
> > but I also guess I would be calculating the FDR only within a particular
> > subset of 30 randomly chosen rs IDs, not for the whole data set.
> >
> > I would like to present the result as in the attached plot. The
> > x-axis shows the proportion of SNPs; in my case a SNP is equivalent to an rs ID.
> >
> > Can you please help with this? I really have no idea how to go about it.
> >
> > Thanks
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
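
A minimal sketch of the random-subset idea described in the original question, not a definitive method. It assumes a data frame `a` with columns `rs` and `pvalue`; the simulated stand-in data, the number of subsets `n_sub`, and the helper `subset_fdr_prop()` are hypothetical and only for illustration. For each random subset of 30 rows it applies `p.adjust(..., method = "BH")` within that subset and records the proportion of SNPs with FDR < 0.05.

```r
set.seed(1)  # reproducible illustration

# Hypothetical stand-in for the real data (3,859,763 rows in the question)
a <- data.frame(rs = paste0("rs", seq_len(10000)),
                pvalue = runif(10000),
                stringsAsFactors = FALSE)

# Proportion of SNPs in one random subset whose BH-adjusted p-value is < alpha
subset_fdr_prop <- function(dat, size = 30, alpha = 0.05) {
  idx <- sample(nrow(dat), size)                    # draw rows without replacement
  fdr <- p.adjust(dat$pvalue[idx], method = "BH")   # FDR within this subset only
  mean(fdr < alpha)
}

n_sub <- 1000  # number of random subsets (assumed value)
props <- replicate(n_sub, subset_fdr_prop(a))

# Share of subsets at each observed proportion
barplot(prop.table(table(props)),
        xlab = "Proportion of SNPs with FDR < 0.05",
        ylab = "Share of random subsets",
        main = "Random subsets of 30 SNPs")
```

With the real data, the simulated `a` would be replaced by the actual data frame. Whether an FDR computed within 30 randomly chosen SNPs answers the underlying question is a separate matter, as raised in the reply above.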
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rsz_screen_shot_2020-01-23_at_100147_pm.png
Type: image/png
Size: 74890 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20200123/57224a4e/attachment.png>