[BioC] CDF for Mouse gene 1.0 ST

Henrik Bengtsson hb at stat.berkeley.edu
Mon Nov 19 17:33:16 CET 2007


Hi Michal.

On Nov 19, 2007 4:12 AM, Michal Okoniewski <MOkoniewski at picr.man.ac.uk> wrote:
> Hi Henrik, Marco & all
>
> The issue with CDF files for exon arrays is not that obvious.
> They are really "unsupported", which means that they include "probesets"
> of background probes, some very long and some single-probe, so they
> cannot be used in BioConductor. I have done some manual editing of the
> CDFs, removing background and controls (http://xmap.picr.man.ac.uk/cdf),
> so that they may be used with RMA and PLIER.

For aroma.affymetrix,

  http://groups.google.com/group/aroma-affymetrix/,

a similar cleanup of exon CDFs has been done (by Ken Simpson, WEHI).  One
reason was that there were some extremely large CDF units ("probe
sets"), and fitting these with a probe-level model such as the
log-additive model (in RMA) would take an exponentially(?) long time.  In
aroma.affymetrix one can easily exclude these from the fit, but if
included, these few units would take >99% of the fitting time.  On the
above group page there are some notes, graphs and discussion on this,
and I know that Elizabeth Purdom (UC Berkeley) and Mark Robinson
(WEHI) are working on updated versions of some of the exon CDFs [Both
BCC:ed].
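
As an illustration of the exclusion step, here is a minimal sketch in
Python.  It is not aroma.affymetrix code: the function name, the unit
names, the probe counts and the 1000-probe cutoff are all made up for
the example, and real probe counts would have to be read from the CDF.

```python
from typing import Dict, List, Tuple


def split_units_by_size(
    probes_per_unit: Dict[str, int], max_probes: int
) -> Tuple[List[str], List[str]]:
    """Partition unit names into (kept, excluded) by probe count.

    Units with more than `max_probes` probes are set aside so that they
    can be left out of the probe-level model fit.
    """
    kept: List[str] = []
    excluded: List[str] = []
    for name, n_probes in probes_per_unit.items():
        if n_probes <= max_probes:
            kept.append(name)
        else:
            excluded.append(name)
    return kept, excluded


# Hypothetical probe counts per unit (illustrative values only).
counts = {"unit_a": 4, "unit_b": 11, "bgp_pool": 20000, "unit_c": 1}

# The cutoff of 1000 probes is an arbitrary assumption for this sketch.
kept, excluded = split_units_by_size(counts, max_probes=1000)
print("fit these units:", kept)
print("skip these units:", excluded)
```

Only the "kept" units would then be passed on to the model fit; the
cutoff itself is a judgment call and depends on the chip type.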

The above modifications are stored as valid CDF files, because
aroma.affymetrix works directly with CDF files.  In your cleanup
process, do you also create an intermediate modified CDF before creating
the R annotation package?  If so, it sounds like the effort can be shared.
If not, do the R annotation packages contain enough information to
write them out as *valid* CDFs?

Cheers

Henrik

>
> Having had a quick look at the "unsupported CDF" for the Mouse Gene 1.0 ST
> array, I see the same problem: control/background probes grouped in huge
> "probesets" that make the CDF useless for summarization without painstaking
> preprocessing/cleaning.
> But the program that I used to clean exon CDFs does not work for gene
> CDFs :|
>
> So Marco, when you find a way around the Mouse Gene CDF, let us
> know.
>
> Cheers & saluti,
> Michal
>
>
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Henrik
> Bengtsson
> Sent: 17 November 2007 16:34
> To: Marco Fabbri
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] CDF for Mouse gene 1.0 ST
>
> Hi,
>
> you will in (almost?) all cases find links to annotation files, sample
> data etc. for Affymetrix arrays at the corresponding "Support Materials"
> page.  For chip types where Affymetrix supports CDF files, you'll find the
> CDF files within a large zipped "Library File".  For exon arrays,
> Affymetrix does not support CDF files, but will often still provide one
> separately (often in an ASCII format), as is also the case for the Mouse
> Gene 1.0 ST array.
>
> Have a look at
>
>
> http://www.affymetrix.com/support/technical/byproduct.affx?product=mogene-1_0-st-v1
>
> and you'll see there is a CDF file.
>
> FYI, you can convert a CDF in an ASCII format into a binary format using
> convertCdf() in affxparser ***and that is highly recommended*** (for
> speed and memory reasons).
>
> /Henrik
>
> On Nov 17, 2007 5:47 AM, Marco Fabbri <fabbri.marco at gmail.com> wrote:
> >  I was looking for a cdf file
> >
> > Marco
> >
> >
> > Hi Marco,
> > Are these not what you want?
> >
> >    from here:
> >
> >    http://www.affymetrix.com/support/technical/byproduct.affx?product=mogene-1_0-st-v1
> >    MoGene-1_0-st-v1 Annotations CSV README (26 KB, 9/5/07)
> >    MoGene-1_0-st-v1 Transcript Cluster Annotations, CSV (11 MB, 9/17/07)
> >    Pingzhao
> >
> >
> > --
> > ---------------------------------------
> > Marco Fabbri
> > Istituto Clinico Humanitas
> > via Manzoni, 56
> > 20089 Rozzano (Mi)
> > Tel. 028224 5130
> > Fax 028224 5101
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
> >


