[BioC] Missing probesets when creating Affymetrix GeneChip miRNA 4.0 CDF package using makecdfenv package

cstrato cstrato at aon.at
Wed Jan 15 16:44:10 CET 2014


Dear Jim,

Thank you for mentioning xps, but sadly I must admit that the miRNA 
arrays are among the few arrays that xps does not support (with the 
exception of miRNA 1.0) due to the non-standard way in which Affymetrix 
handles these arrays.

Best regards,
Christian


On 1/15/14 4:08 PM, James W. MacDonald wrote:
> Hi Lei,
>
> I doubt you did anything wrong, but without having either the cdf or
> the cel file in hand I can't say much else.
>
> You would probably be better served if you were to use
> pdInfoBuilder/oligo or xps to analyze this array. Affy doesn't make the
> cdf available publicly, and even if they were to do so it would likely
> be unsupported. They do however supply the bgp/clf/pgf files, which you
> could use to make a pd.mirna.4.0 package and then use oligo.
>
> You could also use xps, but I will let Christian Stratowa tell you what
> you need to do for that package.
>
> If you want, you can send me the cel file in an offline email and I will
> see about making a pd package.
>
> Best,
>
> Jim
>
>
> On 1/15/2014 1:15 AM, Lei Huang [guest] wrote:
>> Dear all,
>>
>> I am working on a set of Affymetrix GeneChip miRNA 4.0 microarray data
>> and would like to perform differential expression analysis using
>> Bioconductor packages. Since this is a fairly new platform, no CDF or
>> annotation packages are available in the Bioconductor repository at the
>> moment. The Affymetrix folks kindly provided me with the miRNA 4.0 CDF
>> file as well as sample CEL data, so I decided to create a CDF package on
>> my own using make.cdf.package() from the makecdfenv package. I was able
>> to make the package and install it without trouble. However, after I
>> read the raw CEL files and normalized the AffyBatch with vsnrma()/rma(),
>> I found that the number of probesets is only 25065, while the original
>> Affymetrix miRNA 4.0 CDF file contains 36249. I am aware that from
>> version 4, Affymetrix changed their naming convention for the probeset
>> IDs, but this shouldn't cause the problem of missing probesets. What
>> did I do wrong? I would really appreciate it if anyone could give me
>> some hints or advice on solving this problem.
>>
>> -Lei
>>
>> --
>> Lei Huang
>> Center for Research Informatics
>> Biological Science Division
>> University of Chicago
>> http://cri.uchicago.edu
>> --
>>
>> P.S. The following are the code and output from my R session:
>>
>>> setwd("~/Documents/Project/mirna/GeneChip 4-0 Array Sample Data")
>>> library(affy)
>>> library(makecdfenv)
>> Loading required package: affyio
>>> pkgpath <- tempdir()
>>> pname <- cleancdfname(whatcdf("20131118_Human-Brain-AM7962-130ng_rep1_(miRNA-4_0).CEL"))
>>>
>>> make.cdf.package("miRNA-4_0-st-v1.cdf",
>> +                  cdf.path="~/Documents/Project/mirna/miRNA-4_0-st-v1_CDF",
>> +                  compress=FALSE, species = "", packagename=pname,
>> +                  package.path = pkgpath)
>> Reading CDF file.
>> Creating CDF environment
>> Wait for about 251
>> dots.............................................................................................................................................................................................................................................................
>>
>> Creating package in
>> /var/folders/rh/rrlg3bcs6kgcj89zm4mgjjxh0000gq/T//RtmpRos3Be/mirna40cdf
>>
>> README PLEASE:
>> A source package has now been produced in
>> /var/folders/rh/rrlg3bcs6kgcj89zm4mgjjxh0000gq/T//RtmpRos3Be/mirna40cdf.
>> Before using this package it must be installed via 'R CMD INSTALL'
>> at a terminal prompt (or DOS command shell).
>> If you are using Windows, you will need to get set up to install
>> packages.
>> See the 'R Installation and Administration' manual, specifically
>> Section 6 'Add-on Packages' as well as 'Appendix E: The Windows Toolset'
>> for more information.
>>
>> Alternatively, you could use make.cdf.env(), which will not require
>> you to install a package.
>> However, this environment will only persist for the current R session
>> unless you save() it.
>>
>> ## install the cdf package from shell
>> ## cd to mirna40cdf location
>> ## R CMD INSTALL mirna40cdf
>>
>>> library(limma)
>>> library(vsn)
>>> library(mirna40cdf)
>>>
>>> affybatch <- ReadAffy(filenames=list.files())
>>> affybatch@cdfName
>> [1] "miRNA-4_0"
>>
>> ## normalization
>>> eset.norm <- vsnrma(affybatch)
>> vsn2: 292681 x 8 matrix (1 stratum).
>> Please use 'meanSdPlot' to verify the fit.
>> Calculating Expression
>>
>> ## only 25,065 probesets; the original Affymetrix cdf file contains
>> ## 36,249 probesets
>>> dim(eset.norm)
>> Features  Samples
>>     25065        8
>>
>>
>>   -- output of sessionInfo():
>>
>>> sessionInfo()
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] mirna40cdf_1.38.0    AnnotationDbi_1.24.0 vsn_3.30.0
>> [4] limma_3.18.9         makecdfenv_1.38.0    affyio_1.30.0
>> [7] affy_1.40.0          Biobase_2.22.0       BiocGenerics_0.8.0
>>
>> loaded via a namespace (and not attached):
>>   [1] BiocInstaller_1.12.0  compiler_3.0.2        DBI_0.2-7
>>   [4] grid_3.0.2            IRanges_1.20.6        lattice_0.20-24
>>   [7] preprocessCore_1.24.0 RSQLite_0.11.4        stats4_3.0.2
>> [10] tools_3.0.2           zlibbioc_1.8.0
>>
>>
>> --
>> Sent via the guest posting facility at bioconductor.org.
>
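
To narrow down where the probesets go missing, it may help to count the entries in the CDF environment itself and compare that number with 25065 and 36249. The following is only a minimal sketch in base R: the helper names count_probesets() and repeated_names() and the toy probeset names are hypothetical and do not come from makecdfenv or from the miRNA 4.0 CDF. It relies on the fact that an R environment keeps a single entry per name, so units that share a name would collapse into one entry. Whether that is what happens with this CDF is an assumption that the thread does not confirm.

```r
# Count the probe sets stored in a CDF environment (one entry per name).
count_probesets <- function(cdf_env) {
  length(ls(cdf_env, all.names = TRUE))
}

# Return the names that occur more than once in a vector of unit names.
# In an environment, each of these would end up as a single entry.
repeated_names <- function(unit_names) {
  unique(unit_names[duplicated(unit_names)])
}

# Toy illustration with hypothetical names (not from the real CDF).
toy_names <- c("ps_a", "ps_b", "ps_a", "ps_c")
toy_env <- new.env()
for (nm in toy_names) {
  # Assigning the same name twice overwrites the earlier entry.
  assign(nm, matrix(0L, nrow = 1, ncol = 2), envir = toy_env)
}

n_in_env <- count_probesets(toy_env)   # fewer entries than length(toy_names)
dups <- repeated_names(toy_names)

# With the package built in this thread installed, the same check would be:
# library(mirna40cdf)
# count_probesets(mirna40cdf)
```

If the environment already holds only 25065 entries, the loss happens when the package is built and not during vsnrma()/rma().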
