[BioC] IlluminaHumanMethylation450k.db: Missing probes in IlluminaHumanMethylation450kPROBELOCATION function?
James W. MacDonald
jmacdon at uw.edu
Mon Jul 1 20:37:54 CEST 2013
Hi Simone,
On 7/1/2013 1:35 PM, Simone wrote:
> Hello!
>
> A little question: I've got a table with beta values obtained by
> Illumina's 450K BeadChip microarray and want to know for each probe
> where in the gene it is located (mainly promoter region or gene body).
> I found the IlluminaHumanMethylation450kPROBELOCATION function in the
> IlluminaHumanMethylation450k.db package which seems to do what I want,
> but not for all the probes.
> In detail, I have data for 473,029 probes, but the function only
> returns values for 354,770 unique probes, so 118,259 are missing.
> Is this expected behaviour? Should I use another package to get
> location information for all the probes contained in my file?
You would be better off using the FDb.InfiniumMethylation.hg19 package,
which not only has all the locations for the probes, but also has them
in a more useful format.
> library(FDb.InfiniumMethylation.hg19)
> x <- get450k()
Warning message:
In if (is.na(genome(GR))) { :
  the condition has length > 1 and only the first element will be used
> x
GRanges with 485577 ranges and 7 metadata columns:
             seqnames         ranges strand | addressA addressB channel
                <Rle>      <IRanges>  <Rle> |    <Rle>    <Rle>   <Rle>
  cg13869341     chr1 [15865, 15866]      * | 62703328 16661461     Red
  cg14008030     chr1 [18827, 18828]      * | 27651330     <NA>    Both
  cg12045430     chr1 [29407, 29408]      * | 25703424 34666387     Red
  cg20826792     chr1 [29425, 29426]      * | 61731400 14693326     Red
  cg00381604     chr1 [29435, 29436]      * | 26752380 50693408     Red
You can do all kinds of cool things with a GRanges object that you
cannot do with simple location data. But this comes at a cost of
complexity, so you will need to do some reading. I would recommend at a
minimum that you read the vignettes for GenomicFeatures, and look at the
help page for this package (?FDb.InfiniumMethylation.hg19).
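As a rough sketch of the kind of thing a GRanges object makes easy, the
following flags probes that overlap a promoter or a gene. Several pieces
are assumptions made only for illustration: the gene model comes from the
TxDb.Hsapiens.UCSC.hg19.knownGene package, a "promoter" is taken to be
2 kb upstream to 200 bp downstream of the gene start, and my_ids is a
hypothetical stand-in for the probe IDs in your own table. Note that the
gene ranges include the promoter-proximal start of each gene, so the two
flags are not mutually exclusive.

```r
library(FDb.InfiniumMethylation.hg19)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

probes <- get450k()                        # GRanges; names are the probe IDs
txdb   <- TxDb.Hsapiens.UCSC.hg19.knownGene

## Hypothetical probe IDs; replace with the IDs from your beta-value table
my_ids <- c("cg13869341", "cg14008030", "cg12045430")
mine   <- probes[names(probes) %in% my_ids]

## Gene ranges, and an assumed promoter window around each gene start
gene_gr <- genes(txdb)
prom_gr <- promoters(gene_gr, upstream = 2000, downstream = 200)

## One logical flag per probe, stored as extra metadata columns
mine$in_promoter <- overlapsAny(mine, prom_gr, ignore.strand = TRUE)
mine$in_gene     <- overlapsAny(mine, gene_gr, ignore.strand = TRUE)

mine
```

Any probe ID in my_ids that is absent from names(probes) is silently
dropped by the subsetting step, so setdiff(my_ids, names(probes)) is a
quick way to see whether anything in your table has no location.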
The current build is based only on hg19 (GRCh37), so there is no
ambiguity about which build you are getting.
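If you want to confirm the build from the object itself, something along
these lines should work, assuming the genome() accessor that is loaded
along with the package above:

```r
library(FDb.InfiniumMethylation.hg19)

x <- get450k()

## genome() returns one label per sequence (chromosome);
## unique() collapses them to the distinct build name(s)
unique(genome(x))
```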
Best,
Jim
>
> Secondly: as the IlluminaHumanMethylation450k.db package seems to
> be deprecated, which genome build is the
> IlluminaHumanMethylation450kPROBELOCATION information based on? The
> package has separate functions for builds 36 and 37, but the help
> page for the function in question only says that the mappings are
> based on Illumina data from January 2011. That should mean GRCh37,
> the build I have to work with in this case, but I am not completely
> sure because it is not stated clearly.
>
> Best regards,
> Simone
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
> [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C                 LC_NAME=C                  LC_ADDRESS=C
> [10] LC_TELEPHONE=C            LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] sqldf_0.4-6.4 RSQLite.extfuns_0.0.1
> [3] chron_2.3-43 gsubfn_0.6-5
> [5] proto_0.3-10 RColorBrewer_1.0-5
> [7] illuminaHumanv2.db_1.18.0 IlluminaHumanMethylation450k.db_2.0.7
> [9] IlluminaHumanMethylation27k.db_1.4.7 org.Hs.eg.db_2.9.0
> [11] RSQLite_0.11.4 DBI_0.2-7
> [13] AnnotationDbi_1.22.6 affy_1.38.1
> [15] Biobase_2.20.0 BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
> [1] affyio_1.28.0         AnnotationForge_1.2.1 BiocInstaller_1.10.2  IRanges_1.18.1
> [5] preprocessCore_1.22.0 stats4_3.0.1          tcltk_3.0.1           tools_3.0.1
> [9] zlibbioc_1.6.0
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099