[BioC] minfi release 1.8
swarmqq at gmail.com
swarmqq at gmail.com
Fri May 23 21:16:36 CEST 2014
Thanks Tim for sharing this.
I think it is very useful. But I kind of stuck at generating
genomeRatioSet, it has error message.
*> gRatioSet <- mapToGenome(ratioSet, mergeManifest = TRUE)Error in
.Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") :
solving row 1: range cannot be determined from the supplied arguments (too
many NAs)*I am not sure what's wrong? If you happen to know, could you
please share it with me?
Thanks a lot for the help.
Emma
On Saturday, February 1, 2014 2:06:59 PM UTC-5, Tim Triche, Jr. wrote:
>
> There are at least 3 different ways you could go about this; Tiffany
> Morris' probe variant annotation package accompanying ChAMP is probably the
> most comprehensive, while (I claim) the UCSC common variant track is the
> most trivial and the SNPlocs packages are the most intensive. Let's assume
> your data is in a GenomicRatioSet named grSet for all examples.
>
> Writing and testing out the code for this took a little while so I do hope
> you'll work through each example.
>
> All things considered, I'd recommend option #1, but it would be even
> better if the maintainers tweaked a few settings so that
> bsseq::data.frame2GRanges(probe.450K.VCs.af) was more trivial. (Also:
> consider merging $diVC.com1pop.F and $diVC.com1pop.R, making everything
> that can reasonably be so boolean, etc... GRanges are so much easier to
> work with than data.frames when dealing with genomic coordinates)
>
> You'll need to supply your own grSet, but if you didn't have one, I don't
> imagine you'd have asked :-)
> I have provided the dimensions (rows x columns) of the filtered grSets for
> comparison in each case. You will have to decide what degree of
> conservatism is most appropriate for your experiment. Read the docs!
>
>
> 1) Use the ProbeVariants package alliuded to above. This is
> comprehensive, but a bit of a PITA.
>
> require(minfi)
> grSet
> ## class: GenomicRatioSet
> ## dim: 485512 34
>
> require(Illumina450ProbeVariants.db)
> ?probe.450K.VCs.af ## read the docs!
> data(probe.450K.VCs.af)
>
> commonSnpInProbe <- rownames(probe.450K.VCs.af)[ probe.450K.VCs.af$probe50VC.com1pop
> > 0 ]
> grSet.noCommonSNPsInProbes <- grSet[
> setdiff(rownames(grSet), commonSnpInProbe), ]
> grSet.noCommonSNPsInProbes
> ## class: GenomicRatioSet
> ## dim: 308901 34
>
> commonVariantsEitherStrand <- !is.na(probe.450K.VCs.af$diVC.com1pop.F)|!
> is.na(probe.450K.VCs.af$diVC.com1pop.R)
> CpGsWithCommonVariants <- rownames(probe.450K.VCs.af)[ commonVariantsEitherStrand
> ]
> grSet.noCommonVariantsAtCpGs <- grSet[
> setdiff(rownames(grSet),CpGsWithCommonVariants), ]
> grSet.noCommonVariantsAtCpGs
> ## class: GenomicRatioSet
> ## dim: 445502 34
>
> Consult the documentation for finer control over the process (e.g.
> specific populations, etc.).
>
> ?probe.450K.VCs.af ## read the docs again
>
> It would not break my heart if the maintainers turned the data.frame here
> into a GRanges; a few small changes followed by data.frame2GRanges() would
> do the trick. But still, it's a very handy compilation.
> See below for the reason I prefer GRanges (in short, because I'm impatient
> and don't like debugging).
>
>
> 2) Use the FDb package (> 1% MAF across all populations, dbSNP build 135).
> This is a one-liner.
>
> require(minfi)
> grSet
> ## class: GenomicRatioSet
> ## dim: 485512 34
>
> require(FDb.UCSC.snp135common.hg19)
> commonSNPs <- features(FDb.UCSC.snp135common.hg19)
>
> ## With a GRanges object, the previous rigamarole becomes a one-liner:
> grSet.noCommonSNPsAtCpGs <- grSet[ grSet %outside% commonSNPs, ]
>
> grSet.noCommonSNPsAtCpGs
> ## class: GenomicRatioSet
> ## dim: 478385 34
>
> This was my workaround from 2-3 years ago to permit masking of TCGA Level
> 3 methylation data. It still works and it still works well, but it's been
> superseded (IMHO) by more flexible approaches like #1.
>
>
> 3) use SNPlocs (all SNPs in dbSNP; mildly annoying complication with 'ch'
> vs. 'chr' seqlevels)
>
> require(minfi)
> grSet
> ## class: GenomicRatioSet
> ## dim: 485512 34
>
> ## work around the annoyance:
> chroms <- seqlevels(grSet)
> names(chroms) <- chroms
> chroms <- gsub('chr', 'ch', chroms)
> require(SNPlocs.Hsapiens.dbSNP.20120608)
> SNPs.byChr <- GRangesList(lapply(chroms, getSNPlocs, as.GRanges=TRUE))
> ## time passes...
> seqlevels(SNPs.byChr) <- gsub('ch', 'chr', seqlevels(SNPs.byChr)) ## back
> to normal
> genome(SNPs.byChr) <- 'hg19' ## GRCh37.p5 coordinates are identical to
> hg19 save for chrMT
>
> ## once the above hoops have been jumped through, it's back to one-liners:
> grSet.noSNPsAtCpGs <- grSet[ grSet %outside% SNPs.byChr, ]
> grSet.noSNPsAtCpGs
> ## class: GenomicRatioSet
> ## dim: 444722 34
>
> This used to be documented in the minfi code/manual somewhere, though I
> don't know if it still is.
>
>
> Statistics is the grammar of science.
> Karl Pearson <http://en.wikipedia.org/wiki/The_Grammar_of_Science>
>
>
> On Fri, Jan 31, 2014 at 1:06 PM, C T <off... at gmail.com <javascript:>>wrote:
>
>> Any tutorial on how to remove probes that contains SNPs?
>>
>> On Tuesday, November 12, 2013 7:12:46 PM UTC-5, Kasper Hansen wrote:
>> >
>> > As part of Bioconductor 2.13, we have released minfi 1.8.x. Due to a
>> > number of last minute errors, the recommended version is 1.8.3 (or
>> bigger).
>> >
>> > Users may find that their old objects cannot be linked to annotation.
>> > Please run
>> > OBJECT = updateObject(OBJECT)
>> > to fix this.
>> >
>> > Highlights include
>> > * preprocessingQuantile(): an independent implementation of the same
>> ideas
>> > as in Tost et al.
>> > * bumphunter() for finding DMRs
>> > * blockFinder() for finding large hypo-methylated blocks on the 450k
>> array.
>> > * estimateCellCounts() for estimating cell type composition for whole
>> > blood samples. The function can be extended to work on other types of
>> > cells, provided suitable flow sorted data is available.
>> > * the annotation now includes SNP annotation for dbSNP v132, 135 and
>> 137,
>> > independently annotated at JHU.
>> > * getSex(): you can now get sex repeatedly, irrespective of relationship
>> > status.
>> > * minfiQC: find and remove outlier samples based on a sample QC criteria
>> > we have found effective.
>> >
>> > Unfortunately, none of these handy changes are yet detailed in the
>> > vignette; we are working on this.
>> >
>> > A manuscript is in review detailing most of these functions.
>> >
>> > Full NEWS below
>> >
>> > Best,
>> > Kasper D Hansen
>> >
>> > o Added getMethSignal(), a convenience function for programming.
>> >
>> > o Changed the argument name of "type" to "what" for
>> getMethSignal().
>> >
>> > o Added the class "RatioSet", like "GenomicRatioSet" but without
>> the
>> > genome information.
>> >
>> > o Bugfixes to the "GenomicRatioSet()" constructor.
>> >
>> > o Added the method ratioConvert(), for converting a "MethylSet"
>> to a
>> > "RatioSet" or a "GenomicMethylSet" to a "GenomicRatioSet".
>> >
>> > o Fixed an issue with GenomicMethylSet() and GenomicRatioSet()
>> caused
>> > by a recent change to a non-exported function in the
>> GenomicRanges
>> > package (Reported by Gustavo Fernandez Bayon <gba... at gmail.com
>> <javascript:>
>> > >).
>> >
>> > o Added fixMethOutliers for thresholding extreme observations in
>> the
>> > [un]methylation channels.
>> >
>> > o Added getSex, addSex, plotSex for estimating sex of the samples.
>> >
>> > o Added getQC, addQC, plotQC for a very simple quality control
>> > measure.
>> >
>> > o Added minfiQC for a one-stop function for quality control
>> measures.
>> >
>> > o Changed some verbose=TRUE output in various functions.
>> >
>> > o Added preprocessQuantile.
>> >
>> > o Added bumphunter method for "GenomicRatioSet".
>> >
>> > o Handling signed zero in minfi:::.digestMatrix which caused unit
>> > tests to fail on Windows.
>> >
>> > o addSex and addQC lead to sampleNames() being dropped because of
>> a
>> > likely bug in cbind(DataFrame, DataFrame). Work-around has been
>> > implemented.
>> >
>> > o Re-ran the test data generator.
>> >
>> > o Fixed some Depends and Imports issues revealed by new features
>> of R
>> > CMD check.
>> >
>> > o Added blockFinder and cpgCollapse.
>> >
>> > o (internal) added convenience functions for argument checking.
>> >
>> > o Exposed and re-wrote getAnnotation().
>> >
>> > o Changed getLocations() from being a method to a simple function.
>> > Arguments have been removed (for example, now the function
>> always
>> > drops non-mapping loci).
>> >
>> > o Implemented getIslandStatus(), getProbeType(), getSnpInfo() and
>> > addSnpInfo(). The two later functions retrieve pre-computed SNP
>> > overlaps, and the new annotation object includes SNPs based on
>> > dbSNP 137, 135 and 132.
>> >
>> > o Changed the IlluminaMethylatioAnnotation class to now include
>> > genomeBuild information as well as defaults.
>> >
>> > o Added estimateCellCounts for deconvolution of cell types in
>> whole
>> > blood. Thanks to Andrew Jaffe and Andres Houseman.
>> >
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Biocon... at r-project.org <javascript:>
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>
More information about the Bioconductor
mailing list