[BioC] Help with DMPFinder in minfi package

Wed Jun 19 18:34:33 CEST 2013

Hi Srikanth,

On 6/19/2013 12:02 PM, Srinivas Srikanth Manda wrote:
> Hi James,
>
> Thanks for the response. I am a bit new to this kind of analysis. When 
> I use dmpFinder on three different groups vs. control and export the 
> results as a data.frame, I only get columns for probe ID, intercept, 
> and F, p and q values, but not the group name. I want the data in the 
> form the plotCpG function uses (to see the difference in methylation 
> across the three groups). How can I achieve this?
>
> My initial goal:
> Find significantly different probes across three different grades of 
> cancer samples.

You should note that dmpFinder() does just what you are asking, finding 
probes that are significantly different across one or more of your 
sample types. However, it is fitting a particular model that you might 
not like, and doing an F-test for ANY difference. You may want to do 
more directed analyses.

To that end, note that dmpFinder() is just a nice wrapper for doing the 
analysis, and you don't have to use it. You can just as easily use limma 
to do the univariate analyses. I won't get into details here, as the 
limma User's Guide has any number of examples you could emulate (and I 
have already given you a head start below).
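
For orientation only, here is a minimal sketch of what such a directed 
analysis might look like in limma. It runs on simulated data, so the 
matrix M, the factor 'group', and the level names (Control, Grade1, 
Grade2, Grade3) are all assumptions standing in for your real M-values 
and pd$Sample_Group.

```r
library(limma)

set.seed(1)
# Hypothetical stand-in for real data: 100 probes x 12 samples,
# three samples in each of four groups (names are assumptions).
group <- factor(rep(c("Control", "Grade1", "Grade2", "Grade3"), each = 3))
M <- matrix(rnorm(100 * 12), nrow = 100,
            dimnames = list(paste0("probe", 1:100), paste0("s", 1:12)))
# Spike in a shift for the first five probes in Grade3 only
M[1:5, group == "Grade3"] <- M[1:5, group == "Grade3"] + 3

# Cell-means model: one coefficient per group, equal to the group mean
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
fit <- lmFit(M, design)
group.means <- fit$coefficients

# Directed comparisons: each grade against the common control
contrast.matrix <- makeContrasts(
  G1vsC = Grade1 - Control,
  G2vsC = Grade2 - Control,
  G3vsC = Grade3 - Control,
  levels = design)
fit2 <- eBayes(contrasts.fit(fit, contrast.matrix))

# Top probes for one specific comparison
top <- topTable(fit2, coef = "G3vsC", number = 10)

# With no 'coef', topTable reports an F-test across all three contrasts,
# i.e. a test for ANY difference from control
top.any <- topTable(fit2, number = 10)
```

The group.means matrix has one row per probe and one column per group, 
which is probably the closest thing to 'values in each group', and each 
contrast gives you a grade-vs-control comparison rather than one overall 
F-test.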

> Map the probes to gene regions (like promoters, gene body, UTR, etc.)

See ?mapToGenome for mapping the probes to genomic coordinates. You will 
then need to map significantly differentially methylated regions to 
genomic features and maybe make some nice plots. This isn't super 
difficult, but it does require a fair amount of base knowledge of a 
bunch of packages.
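
As a rough illustration of the second step, here is a toy sketch of 
overlapping probe positions with genomic features using GenomicRanges. 
The probe IDs, coordinates and feature names are all made up; in 
practice the probe ranges would come from the mapToGenome() result 
(e.g. via granges() in recent minfi versions, which is an assumption 
about your installed version) and the features from an annotation 
package.

```r
library(GenomicRanges)

# Hypothetical probe positions (single-base ranges); IDs are made up
probes <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
                  ranges = IRanges(start = c(150, 5000, 300), width = 1))
names(probes) <- c("cg01", "cg02", "cg03")

# Hypothetical features, e.g. promoter regions of two genes
features <- GRanges(seqnames = c("chr1", "chr2"),
                    ranges = IRanges(start = c(100, 1000),
                                     end = c(200, 2000)))
mcols(features)$feature <- c("promoter_geneA", "promoter_geneB")

# Find which probes fall inside which features
hits <- findOverlaps(probes, features)

# One row per probe/feature overlap
annotated <- data.frame(
  probe = names(probes)[queryHits(hits)],
  feature = mcols(features)$feature[subjectHits(hits)],
  stringsAsFactors = FALSE)
```

Here only cg01 lands inside a feature, so 'annotated' has a single row. 
The same pattern works for gene bodies, UTRs, or any other set of 
ranges you can get into a GRanges object.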

You should probably peruse the following:

http://bioconductor.org/help/course-materials/2013/SeattleFeb2013/IntermediateSequenceAnalysis2013.pdf

which covers a lot of what you want to do. Differential methylation 
isn't particularly different in a lot of respects from, e.g., RNA-seq: 
you have a genomic position that you think is 'interesting' and you 
might want to know if there is any known <something> nearby. The only 
difference is why you think it is interesting.

There is no substitute for just trying to do something and doing your 
best to figure out why things aren't working. Search the list, read the 
vignettes, read the help pages.

Best,

Jim

> Find enriched regions.
>
> Any help for the above tasks is appreciated.
>
>
> Thanks
> Srikanth
>
>
>
>
>
> On Wed, Jun 19, 2013 at 7:06 PM, James W. MacDonald <jmacdon at uw.edu> wrote:
>
>     Hi Srinivas,
>
>
>     On 6/19/2013 5:21 AM, Srinivas Srikanth Manda wrote:
>
>         Hello Members,
>
>         I am using the minfi package to analyze 450k data. I have three
>         different groups of samples and one common control. I did the
>         normalization and other steps according to the manual, but I am
>         stuck at the differential methylation positions step. When I use:
>
>         M<- getM(MSet.norm, type = "beta", betaThreshold = 0.001)
>         dmp1<- dmpFinder(M, pheno=pd$Sample_Group, type="categorical")
>
>         I want to get a table with probes and corresponding values in
>         each group. The data.frame dmp1 does not tell me which group
>         has what value. How can I do that?
>
>
>     It's not clear what you mean by 'probes and corresponding values
>     in each group'. I am not sure what a corresponding value is.
>
>     If I make the assumption that you want the coefficients from the
>     model fit, then you can do
>
>     design <- model.matrix(~pd$Sample_Group)
>     fit <- lmFit(M, design)
>
>     and then fit$coefficients has the coefficients. Or perhaps you
>     just want the methylation values? The M-values are in your M
>     matrix, and if you prefer betas, you can use getBeta(MSet.norm).
>
>     You might also just want the mean of each group. In which case it
>     would be easier to do
>
>     design <- model.matrix(~0+pd$Sample_Group)
>     fit <- lmFit(M, design)
>
>     and then fit$coefficients will contain the mean value for each
>     group, by probe.
>
>     Best,
>
>     Jim
>
>
>
>
>
>         sessionInfo()
>         R version 2.15.2 (2012-10-26)
>         Platform: x86_64-unknown-linux-gnu (64-bit)
>
>         locale:
>           [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>           [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>           [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8
>           [7] LC_PAPER=C                LC_NAME=C
>           [9] LC_ADDRESS=C              LC_TELEPHONE=C
>         [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
>         attached base packages:
>         [1] stats     graphics  grDevices utils     datasets  methods   base
>
>         other attached packages:
>           [1] minfiData_0.3.1
>           [2] IlluminaHumanMethylation450kmanifest_0.4.0
>           [3] minfi_1.4.0
>           [4] Biostrings_2.26.3
>           [5] GenomicRanges_1.10.7
>           [6] IRanges_1.16.6
>           [7] reshape_0.8.4
>           [8] plyr_1.8
>           [9] lattice_0.20-15
>         [10] Biobase_2.18.0
>         [11] BiocGenerics_0.4.0
>
>         loaded via a namespace (and not attached):
>           [1] affyio_1.26.0         annotate_1.36.0       AnnotationDbi_1.20.7
>           [4] beanplot_1.1          BiocInstaller_1.8.3   bit_1.1-10
>           [7] codetools_0.2-8       crlmm_1.16.9          DBI_0.2-7
>         [10] ellipse_0.3-8         ff_2.2-11             foreach_1.4.0
>         [13] genefilter_1.40.0     grid_2.15.2           iterators_1.0.6
>         [16] limma_3.14.4          MASS_7.3-23           Matrix_1.0-12
>         [19] matrixStats_0.8.1     mclust_4.1            multtest_2.14.0
>         [22] mvtnorm_0.9-9994      nor1mix_1.1-4         oligoClasses_1.20.0
>         [25] parallel_2.15.2       preprocessCore_1.20.0 RColorBrewer_1.0-5
>         [28] RcppEigen_0.3.1.2.1   R.methodsS3_1.4.2     RSQLite_0.11.3
>         [31] siggenes_1.32.0       splines_2.15.2        stats4_2.15.2
>         [34] survival_2.37-4       tools_2.15.2          XML_3.96-1.1
>         [37] xtable_1.7-1          zlibbioc_1.4.0
>
>
>
>         Regards,
>         Srikanth
>
>
>
>     -- 
>     James W. MacDonald, M.S.
>     Biostatistician
>     University of Washington
>     Environmental and Occupational Health Sciences
>     4225 Roosevelt Way NE, # 100
>     Seattle WA 98105-6099
>
>
>
>
> -- 
> Srinivas Srikanth Manda
> Ph.D. Student
> Institute of Bioinformatics
> Discoverer, 7th Floor,
> International Technology Park,
> Bangalore, India
> Mob:+919019114878

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099