[BioC] makePDPackage problem? -- Affy promoter arrays

Mark Robinson mrobinson at wehi.EDU.AU
Mon Aug 4 03:02:48 CEST 2008


Hi all.

This may not actually be a problem; it may be something silly that
I'm doing.  But something strikes me as odd.  My explanation is
below.


I am working with the Affymetrix promoter tiling arrays.  My starting
point is a BPMAP file, which you can get from the Affy library bundle at:

http://www.affymetrix.com/products/arrays/specific/human_promoter.affx

... or I have also been using the re-worked BPMAP file you can get  
from the people who developed MAT:

http://chip.dfci.harvard.edu/~wli/MAT/Download.htm

So, I use 'makePDpackage' to create an R package, along the lines of:

(I've just renamed each downloaded BPMAP file to have 'affy' or
'harvard' in its name, so that I can tell them apart)

 > makePDpackage("Hs_PromPR_v02-3_NCBIv36.affy.bpmap",type="tiling",manufacturer="affymetrix",genome="hg18")
affymetrix tiling
The package will be called pd.hs.prompr.v02.3.ncbiv36.affy
Array identified as having 914 rows and 914 columns.
Creating package in /export/share/disk501/lab0605/mrobinson/projects/microarray/pd.hs.prompr.v02.3.ncbiv36.affy

 > makePDpackage("Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap",type="tiling",manufacturer="affymetrix",genome="hg18")
affymetrix tiling
The package will be called pd.hs.prompr.v01.3.ncbiv36.nr.harvard
Array identified as having 914 rows and 914 columns.
Creating package in /export/share/disk501/lab0605/mrobinson/projects/microarray/pd.hs.prompr.v01.3.ncbiv36.nr.harvard

... then I run R CMD INSTALL ... from the command prompt.  One thing
that strikes me as odd is that the array is identified as having 914
rows and 914 columns.  See below.

So, I read in the data for a single CEL file, look at the raw
intensity at a particular X and Y location on the chip, and compare
it to what I get from 'readCel' in the affxparser package.


 > rd<-read.celfiles("CEL/test1.CEL",pkgname="pd.hs.prompr.v01.3.ncbiv36.nr.harvard")
Platform design info loaded.
The intensity matrix will require 35.79 MB of RAM.
 > pd<-getPD(rd)
 > length(pd$X)
[1] 4286817
 > dim(rd)
Features  Samples
  4286817        1
 > w<-which(pd$X==1344 & pd$Y==854)
 > w
[1] 1267129
 > exprs(rd)[w,]
[1] 123
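
A quick arithmetic check on the numbers above (only a sketch, using the values copied from the output, not anything read from the files): the platform design object reports more features than a 914 x 914 grid has cells, so the 914 x 914 layout reported when the package was built cannot describe every feature.

```r
# Numbers copied from the output above; nothing here touches the CEL file.
n_features <- 4286817        # length(pd$X), also dim(rd)[1]
grid_cells <- 914 * 914      # cells on a 914 x 914 array

print(grid_cells)                # 835396
print(n_features > grid_cells)   # TRUE
```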


 > library(affxparser)
 > x<-readCel("CEL/test1.CEL",readXY=TRUE)
 > x$header[c("rows","cols")]
$rows
[1] 2166

$cols
[1] 2166
 > w<-which(x$x==1344 & x$y==854)
 > w
[1] 1851109
 > x$intensities[w]
[1] 8074

So, this chip does have 2166 rows and columns, which could be  
introducing problems in the indexing.  I haven't dug any deeper on this.
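
To illustrate the indexing point, here is a minimal sketch. It assumes cells are numbered row by row with zero-based x and y, i.e. index = y * ncols + x + 1; 'cell_index' is a helper made up for this example, not a function from either package. Under that assumption, 2166 columns reproduces the index returned by 'which' on the readCel output, whereas x = 1344 does not even fit on a 914-column grid.

```r
# Assumption: row-major cell numbering with zero-based x/y coordinates.
# 'cell_index' is a hypothetical helper, not part of oligo or affxparser.
cell_index <- function(x, y, ncols) y * ncols + x + 1

idx_2166 <- cell_index(1344, 854, ncols = 2166)
print(idx_2166)            # 1851109, the index found via readCel above

# With 914 columns the valid x range is 0..913, so x = 1344 is off the grid.
fits_914 <- 1344 <= 914 - 1
print(fits_914)            # FALSE
```

If the assumption holds, an x/y lookup in the platform design object only lines up with the CEL file when both use the 2166-column layout.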

Does anyone know what is happening?  Is this a problem in making the
package through 'makePDpackage', or do I misunderstand the
correspondence between the elements of a 'TilingFeatureSet' and the
corresponding 'platformDesign' object?

Thanks!
Mark

 > sessionInfo()
R version 2.7.0 (2008-04-22)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] makePlatformDesign_1.4.0
 [2] affyio_1.8.0
 [3] pd.hs.prompr.v01.3.ncbiv36.nr.harvard_1.4.0
 [4] oligo_1.4.0
 [5] oligoClasses_1.2.0
 [6] AnnotationDbi_1.2.0
 [7] preprocessCore_1.2.0
 [8] RSQLite_0.6-9
 [9] DBI_0.2-4
[10] Biobase_2.0.1
[11] affxparser_1.12.2


------------------------------
Mark Robinson
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852


