[BioC] Normalization of array data from GEO repository
Steve Lianoglou
mailinglist.honeypot at gmail.com
Tue Jul 7 19:59:19 CEST 2009
Hi,
On Jul 7, 2009, at 5:38 AM, Aleš Maver wrote:
> Hi all,
> I have obtained several GEO Series (GSE) entries from GEO repository using
> getGEO function (GEOquery package).
> Data obtained in this manner is stored in ExpressionSet class. The problem
> is I don't know how to perform quality control analyses and normalization
> procedures on ExpressionSet data, because functions like expresso (affy
> package) work only on AffyBatch classes. Is there anything I am missing?
Sorry, I've never used the GEOquery package before, so I can't speak
much to that, but I'd be surprised if there isn't an option to return
your results as an AffyBatch object, since you can get most of the
data from GEO in its raw format (e.g., CEL files).
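A rough sketch of that route, in case it helps. It assumes the series
ships a supplementary <accession>_RAW.tar full of Affymetrix CEL files
(not every series does), and the accession below is only a placeholder:

```r
library(GEOquery)  # getGEOSuppFiles()
library(affy)      # list.celfiles(), ReadAffy(), rma()

gse_id <- "GSEnnnn"  # placeholder: replace with a real GSE accession

# Download the series' supplementary files into ./<gse_id>/
getGEOSuppFiles(gse_id)

# Assumption: the raw data is packaged as <gse_id>_RAW.tar
raw_tar <- file.path(gse_id, paste0(gse_id, "_RAW.tar"))
cel_dir <- file.path(gse_id, "CEL")
untar(raw_tar, exdir = cel_dir)

# Build an AffyBatch from the extracted CEL files
# (gzipped CEL files are usually read directly; gunzip them if this fails)
cel_files <- list.celfiles(cel_dir, full.names = TRUE)
abatch <- ReadAffy(filenames = cel_files)

# Background-correct, normalize and summarize; the result is an ExpressionSet
eset <- rma(abatch)
```

Once you have the AffyBatch, expresso and the other affy functions
should work on it as usual.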
> And- does anyone know whether data in GEO repository is already normalised
> or not?
It depends. Sometimes you aren't given the raw files: the data may come
from a custom array, or the dataset may only be provided in
post-processed form (already MAS5 normalized, for example). In my
experience, though, you can get the raw data for most of the
experiments you find there.
Also, for array quality assessment, look into the arrayQualityMetrics
package:
http://www.bioconductor.org/packages/release/bioc/html/arrayQualityMetrics.html
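As far as I can tell it will take an ExpressionSet directly, so
something along these lines may work on what getGEO gives you. This is
only a sketch: the accession and the output directory name are
placeholders, and taking the first element assumes a single-platform
series:

```r
library(GEOquery)
library(arrayQualityMetrics)

gse_id <- "GSEnnnn"  # placeholder: replace with a real GSE accession

# getGEO() on a GSE returns a list of ExpressionSets, one per platform;
# take the first one here
eset <- getGEO(gse_id)[[1]]

# Write an HTML quality report into the (placeholder) directory "aqm_report"
arrayQualityMetrics(expressionset = eset,
                    outdir = "aqm_report",
                    force = TRUE,             # overwrite outdir if it exists
                    do.logtransform = FALSE)  # TRUE if values are not on a log scale
```

Whether do.logtransform should be TRUE depends on how the submitter
processed the data, so check the range of exprs(eset) first.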
Hope that helps,
-steve
--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact