[BioC] makePDPackage problem? -- Affy promoter arrays
Mark Robinson
mrobinson at wehi.EDU.AU
Mon Aug 4 03:02:48 CEST 2008
Hi all.
This may not actually be a problem; it may be something silly
that I'm doing. But something strikes me as odd. My explanation
is below.
I am working with the Affymetrix promoter tiling arrays. My starting
point is a BPMAP file, which you can get from the Affy library bundle at:
http://www.affymetrix.com/products/arrays/specific/human_promoter.affx
... or I have also been using the re-worked BPMAP file you can get
from the people who developed MAT:
http://chip.dfci.harvard.edu/~wli/MAT/Download.htm
So, I use 'makePDpackage' to create an R package, along the lines of:
(I've renamed each downloaded BPMAP file to include 'affy' or
'harvard' in its name, so that I can tell them apart.)
> makePDpackage("Hs_PromPR_v02-3_NCBIv36.affy.bpmap",type="tiling",manufacturer="affymetrix",genome="hg18")
affymetrix tiling
The package will be called pd.hs.prompr.v02.3.ncbiv36.affy
Array identified as having 914 rows and 914 columns.
Creating package in /export/share/disk501/lab0605/mrobinson/projects/microarray/pd.hs.prompr.v02.3.ncbiv36.affy
> makePDpackage("Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap",type="tiling",manufacturer="affymetrix",genome="hg18")
affymetrix tiling
The package will be called pd.hs.prompr.v01.3.ncbiv36.nr.harvard
Array identified as having 914 rows and 914 columns.
Creating package in /export/share/disk501/lab0605/mrobinson/projects/microarray/pd.hs.prompr.v01.3.ncbiv36.nr.harvard
... then run R CMD INSTALL ... from the command prompt. What strikes
me as odd is that both arrays are identified as having 914 rows and
914 columns. See below.
So, I read in the data for a single file, look at the raw data for
a particular X and Y location on the chip, and compare this to what I
get from 'readCel' in the affxparser package.
> rd<-read.celfiles("CEL/test1.CEL",pkgname="pd.hs.prompr.v01.3.ncbiv36.nr.harvard")
Platform design info loaded.
The intensity matrix will require 35.79 MB of RAM.
> pd<-getPD(rd)
> length(pd$X)
[1] 4286817
> dim(rd)
Features Samples
4286817 1
> w<-which(pd$X==1344 & pd$Y==854)
> w
[1] 1267129
> exprs(rd)[w,]
[1] 123
> library(affxparser)
> x<-readCel("CEL/test1.CEL",readXY=TRUE)
> x$header[c("rows","cols")]
$rows
[1] 2166
$cols
[1] 2166
> w<-which(x$x==1344 & x$y==854)
> w
[1] 1851109
> x$intensities[w]
[1] 8074
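So, this chip does have 2166 rows and columns, not 914, and the same
(X, Y) location gives different intensities through the two routes
(123 vs. 8074). The mismatch in dimensions could be introducing
problems in the indexing. I haven't dug any deeper on this.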
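As a quick sanity check on the numbers above, here is a minimal base R sketch. It assumes zero-based x/y coordinates and row-major cell ordering, i.e. index = y * ncol + x + 1. The helper name xy_to_index is hypothetical and is not part of affxparser or oligo.

```r
# Hypothetical helper: convert zero-based (x, y) chip coordinates to a
# one-based cell index, assuming row-major ordering with 'ncol' columns.
xy_to_index <- function(x, y, ncol) {
  y * ncol + x + 1
}

# Coordinates and grid size taken from the session above.
idx_2166 <- xy_to_index(1344, 854, ncol = 2166)
idx_2166  # compare with the 'w' found via readCel()

# With only 914 columns, x = 1344 would not fit on the grid at all.
fits_914 <- 1344 < 914

# Cell counts for the two candidate grid sizes, to compare with the
# 4286817 features reported by dim(rd).
cells_914 <- 914^2
cells_2166 <- 2166^2
```

Under these assumptions the formula gives 1851109, the same index found with readCel. Row 1267129 of the design table therefore looks like a row position rather than a cell index, and a 914 x 914 grid (835396 cells) is too small to hold the 4286817 features.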
Does anyone know what is happening? Is this a problem in making the
package through 'makePDpackage', or do I misunderstand the
correspondence between the elements of a 'TilingFeatureSet' and the
corresponding 'platformDesign' object?
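To make the correspondence question concrete, here is a toy base R sketch. It is only an illustration, assuming intensities are stored in row-major cell order and that the design table lists a subset of cells in arbitrary order. All names (ncol_chip, design, cell_idx) are made up for the example, and it says nothing about how oligo actually orders features internally.

```r
# Toy 4 x 4 chip: intensities stored in cell-index order (row-major).
ncol_chip <- 4
intensities <- seq(100, by = 10, length.out = ncol_chip^2)

# Toy design table covering only some cells, in arbitrary order
# (zero-based X and Y, as in the session above).
design <- data.frame(X = c(3, 0, 2), Y = c(1, 2, 0))

# One-based cell index for each design row.
cell_idx <- design$Y * ncol_chip + design$X + 1

# Intensities looked up by cell index versus by row position.
by_index <- intensities[cell_idx]
by_row <- intensities[seq_len(nrow(design))]
```

If the design table is not in cell order, by_index and by_row differ, which is the kind of disagreement seen between exprs(rd)[w,] and x$intensities[w] above.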
Thanks!
Mark
> sessionInfo()
R version 2.7.0 (2008-04-22)
x86_64-unknown-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] splines tools stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] makePlatformDesign_1.4.0
[2] affyio_1.8.0
[3] pd.hs.prompr.v01.3.ncbiv36.nr.harvard_1.4.0
[4] oligo_1.4.0
[5] oligoClasses_1.2.0
[6] AnnotationDbi_1.2.0
[7] preprocessCore_1.2.0
[8] RSQLite_0.6-9
[9] DBI_0.2-4
[10] Biobase_2.0.1
[11] affxparser_1.12.2
------------------------------
Mark Robinson
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852