[BioC] Affymetrix exon arrays?
hb at maths.lth.se
Thu Jul 27 23:00:20 CEST 2006
A comment: For the advanced user, the affxparser package is a good
start here. It is memory efficient and fast.
I don't work with exon arrays myself, but I know that at least one
person has used the affxparser package to read exon CDF and CEL files
without problems. Note: if you can get hold of binary CDF files,
reading them is *much* faster than reading ASCII CDF files. The same
is true for CEL files.
Typically you do not have to read all of the data in at once, but only
a subset, which is supported by affxparser.
With readCel() you have access to the probe-level data ordered
from the top-left corner to the bottom-right corner of the array (that
is, ordered by (x,y)). This way you'll be able to access the data so
you can normalize it.
With readCelUnits() you have access to the probe-level data ordered in
probesets as defined by the CDF (though I don't know how probesets are
defined on exon arrays). This allows you to summarize data across
arrays without having to load all of the data into memory at once.
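To illustrate the access pattern only (this is not affxparser code),
here is a minimal Python sketch. It assumes a toy file layout of one
raw float64 per cell, and a hypothetical dictionary `toy_cdf` that maps
probeset names to cell indices. The names read_cells() and
summarize_units() are made up for the example, and the median of log2
intensities merely stands in for a real summarization method.

```python
import math
import os
import statistics
import tempfile
from array import array

CELL_BYTES = 8  # assumption: one float64 per cell in this toy layout


def read_cells(path, indices):
    """Read only the requested cells from one array file by seeking."""
    out = []
    with open(path, "rb") as fh:
        for i in indices:
            fh.seek(i * CELL_BYTES)
            out.append(array("d", fh.read(CELL_BYTES))[0])
    return out


def summarize_units(paths, cdf, unit_names):
    """Median of log2 intensities per probeset and array.

    Only the cells of one probeset from one array are held at a time.
    """
    result = {}
    for name in unit_names:
        cells = cdf[name]
        result[name] = [
            statistics.median(math.log2(v) for v in read_cells(p, cells))
            for p in paths
        ]
    return result


# Toy data: two "arrays" of six cells each, written as raw doubles.
workdir = tempfile.mkdtemp()
toy_arrays = {
    "array1.bin": [8.0, 2.0, 4.0, 16.0, 1.0, 32.0],
    "array2.bin": [4.0, 8.0, 8.0, 4.0, 8.0, 4.0],
}
cel_paths = []
for fname, values in toy_arrays.items():
    path = os.path.join(workdir, fname)
    with open(path, "wb") as fh:
        array("d", values).tofile(fh)
    cel_paths.append(path)

# Hypothetical probeset definitions (cell indices per unit).
toy_cdf = {"unitA": [0, 3, 5], "unitB": [1, 2, 4]}
summary = summarize_units(cel_paths, toy_cdf, ["unitA", "unitB"])
```

Memory use here depends on the size of a probeset, not on the size or
number of arrays, which is the point of reading by units.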
FYI: I'm working on a package (aroma.affymetrix) that, among other
things, allows you to (quantile) normalize virtually any number of
arrays, e.g. I normalized the 90 CEPH 100K SNP arrays with <150 MB of
RAM. The idea is to work with the (CEL) files directly (utilizing
affxparser) without reading everything into memory at the same time.
If I find the time (and a poster spot) I'll try to prepare a poster on
this for the Bioconductor meeting in Seattle, if you happen to be
there. If there is no poster, just grab me there and I'll show you on
my laptop.
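The general idea of normalizing file by file can be sketched in two
passes. This is a minimal Python illustration, not the
aroma.affymetrix implementation: it assumes each array is stored as raw
float64 values in its own file and that all arrays have the same number
of probes. The helper names are hypothetical, and tied intensities are
not averaged.

```python
import os
import tempfile
from array import array


def read_intensities(path):
    """Load one array's intensities (raw float64 values) from disk."""
    values = array("d")
    with open(path, "rb") as fh:
        values.frombytes(fh.read())
    return values


def write_intensities(path, values):
    """Write intensities back to disk as raw float64 values."""
    with open(path, "wb") as fh:
        array("d", values).tofile(fh)


def target_distribution(paths):
    """Pass 1: average the sorted intensities, one array in memory at a time."""
    total = None
    for path in paths:
        ordered = sorted(read_intensities(path))
        if total is None:
            total = [0.0] * len(ordered)
        for i, v in enumerate(ordered):
            total[i] += v
    return [s / len(paths) for s in total]


def normalize_file(src, dst, target):
    """Pass 2: give each probe the target value of its rank in its own array."""
    values = read_intensities(src)
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = target[rank]
    write_intensities(dst, out)


# Toy data: three "arrays" of four probes each.
qn_dir = tempfile.mkdtemp()
raw = {
    "a.bin": [5.0, 2.0, 3.0, 4.0],
    "b.bin": [4.0, 1.0, 3.0, 2.0],
    "c.bin": [3.0, 4.0, 6.0, 8.0],
}
array_paths = []
for fname, vals in raw.items():
    p = os.path.join(qn_dir, fname)
    write_intensities(p, vals)
    array_paths.append(p)

target = target_distribution(array_paths)
for p in array_paths:
    normalize_file(p, p + ".qn", target)
```

At any moment only one array plus the target distribution is in
memory, so the footprint does not grow with the number of arrays.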
On 4/10/06, Johannes Rainer <johannes.rainer at tcri.at> wrote:
> Dear all,
> actually i have also the same problem,
> my server runs since last thursday trying to make a cdf package. currently i
> use the affymetrix ExACT software to normalize the exon data. as far as i
> have seen the ExACT scripts are perl scripts which compile and run smoothly
> in unix (we had problems running the precompiled versions on windows, so i
> compiled them from the source in linux).
> so currently i use ExACT for the normalization (quantile) and summarization
> (RMA, using just the PM) and analyze the normalized data in R
> best, jo
> On 4/8/06, Michael Seewald <mseewald at gmail.com> wrote:
> > Dear all,
> > Is it possible to analyze Affymetrix exon arrays with R/Bioconductor? I
> > tried to generate a cdf environment with makecdfenv (as suggested by
> > James), however the command never finished. The R process grows until
> > it takes about 8 GB of RAM, then it is stuck.
> > I am grateful for any help or advice.
> > Best wishes,
> > Michael
> > On 11/23/05, James W. MacDonald <jmacdon at med.umich.edu> wrote:
> > >
> > > Natalia Becker wrote:
> > > > I have just started working with the GeneChip(r) Human Exon 1.0 ST
> > > > Array (v2 release version of the library files) from Affymetrix.
> > > >
> > > > Unfortunately the R package "affy" doesn't accept the .CLF and .PGF
> > > > files.
> > > >
> > > > Could you send me the HuEx-1_0-st-v2.cdf file or show me the way how I
> > > > can create the CDF file by my own?
> > >
> > > You can use make.cdf.package() or make.cdf.env() in the makecdfenv
> > > package.
> > >
> > > Best,
> > > Jim
> > >
> > --
> > Dr. Michael Seewald
> > Bioinformatics
> > Bayer HealthCare AG
> > [[alternative HTML version deleted]]
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
> Johannes Rainer, Msc
> Tyrolean Cancer Research Institute
> Innrain 66, 6020 Innsbruck, Austria
> Tel.: +43 512 570485 15
> Email: johannes.rainer at tcri.at
> johannes.rainer at tugraz.at