[BioC] Variance stabilization of m-values
Gustavo Fernández Bayón
gbayon at gmail.com
Fri Aug 24 10:06:40 CEST 2012
Hi Tim.
Sorry for the late reply. (OFFTOPIC: my third child decided to be born :) the day after I asked the question on the list, so I have been on paternity leave and really had no time to answer emails.)
The arcsin proposal is very interesting. I'll give it a try too, although, as I told Dr. Smyth in my reply, I am no longer sure the curve is as important as I first thought. I am currently re-working that pipeline, because I have to remember exactly where I was twenty days ago, and that is sometimes hard :)
Thank you very much for your hints.
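For reference, the two transforms mentioned in this thread (M-values and the arcsine square root) can be sketched in base R as follows. This is only an illustration under stated assumptions: the `beta` matrix is simulated rather than real 450k data, the precision value of 50 is arbitrary, and `sd_trend_ratio()` is a hypothetical helper (not part of any package) that compares the median per-probe SD across bins of ranked mean, as a crude numeric stand-in for eyeballing meanSdPlot().

```r
# ASSUMPTION: simulated beta values, not real Illumina 450k data.
set.seed(1)
n_probes  <- 1000
n_samples <- 12

# Per-probe mean methylation level, spread across (0, 1)
mu <- runif(n_probes, min = 0.05, max = 0.95)

# Beta-distributed values around each probe mean (rows = probes).
# The precision of 50 is an arbitrary choice for this sketch.
beta <- matrix(
  rbeta(n_probes * n_samples, shape1 = mu * 50, shape2 = (1 - mu) * 50),
  nrow = n_probes
)
# Keep values strictly inside (0, 1) so the logit below stays finite
beta <- pmin(pmax(beta, 1e-6), 1 - 1e-6)

# M-values: log2 ratio of methylated to unmethylated proportion
m_values <- log2(beta / (1 - beta))

# Arcsine square-root transform of the proportions
arcsin_values <- asin(sqrt(beta))

# HYPOTHETICAL helper: ratio of the largest to the smallest median
# per-probe SD across bins of ranked per-probe mean.
sd_trend_ratio <- function(x, n_bins = 5) {
  probe_mean <- rowMeans(x)
  probe_sd   <- apply(x, 1, sd)
  bins       <- cut(rank(probe_mean), breaks = n_bins)
  bin_median <- tapply(probe_sd, bins, median)
  max(bin_median) / min(bin_median)
}

ratios <- c(
  beta   = sd_trend_ratio(beta),
  M      = sd_trend_ratio(m_values),
  arcsin = sd_trend_ratio(arcsin_values)
)
print(round(ratios, 2))
```

A ratio close to 1 suggests a roughly flat mean-SD relationship on that scale. With real data, meanSdPlot() remains the better diagnostic, and this summary says nothing about sensitivity later in the pipeline.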
Regards,
Gus
On Friday, 3 August 2012 at 04:16, Tim Triche, Jr. wrote:
> The mean-variance plot should be far "more" horizontal with M-values than beta-values; have you plotted it against total intensity? You end up going down the rabbit hole eventually due to copy number variation, but plotting m-value variance against the mean, the line of best fit is nearly flat across the range of values. The variance is more U-shaped (as opposed to the "n" shape with beta values).
>
> You could try an arcsin transform
>
> asin(sqrt(beta))
>
> if your primary goal is to stabilize the variance, though Dr. Smyth's suggestion will probably be better for sensitivity in the end.
>
> Just a thought. There are many ways to transform a proportion and they all have relative strengths and weaknesses in practice.
>
>
>
> On Thu, Aug 2, 2012 at 4:19 PM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
> > Use eBayes with trend=TRUE later in the pipeline, then variance stabilization may not be needed.
> >
> > Gordon
> >
> > > Date: Wed, 1 Aug 2012 15:20:56 +0200
> > > From: Gustavo Fernández Bayón <gbayon at gmail.com>
> > > To: bioconductor at r-project.org
> > > Subject: [BioC] Variance stabilization of m-values
> > >
> > > Hi everybody.
> > >
> > > I am working with Illumina 450k methylation data. I am currently cleaning a data set, getting rid of XY probes, etc., and I would like to do a non-specific filtering and preserve only 20% of the probes, those with the highest variability (as seen in Chapter 7 of the Bioconductor Case Studies book).
> > >
> > > In the book, they create a meanSdPlot() and proceed on the basis that the variance does not depend on the mean (to a significant degree).
> > >
> > > Trying to follow that procedure, I have converted my beta values to M-values, and then called meanSdPlot(). It shows, for my data, that there is a relationship between mean and variance, i.e. the line with the median is not horizontal. Of course, if I create a meanSdPlot with the beta values, the effect is greater, due to their heteroscedasticity.
> > >
> > > Question: Is it correct to use a variance stabilization transformation (such as the one in justvsn) on the M-values in order to discard low-variance probes?
> > >
> > > Any hint will be much appreciated.
> > >
> > > Regards,
> > > Gus
> >
> >
>
>
>
>
> --
> A model is a lie that helps you see the truth.
>
> Howard Skipper (http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf)