[BioC] GEOquery on rawdata and processed data ?

Sean Davis sdavis2 at mail.nih.gov
Tue Jul 3 22:29:10 CEST 2007


Alex Tsoi wrote:
> Thanks all of you for the information.
>
> However, as I mentioned in my previous emails, some GEO samples (e.g., 
> GSM72287) have both a .CEL file and an .EXP file. I looked up the 
> paper: http://www.ncbi.nlm.nih.gov/sites/entrez
> and the authors say that they deposited the processed data as .CEL 
> and the raw data as .EXP.

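The .CEL files are, by definition, raw files: they hold the probe-level 
intensities, whereas an .EXP file only records experiment and scan 
parameters.  If a manuscript says otherwise, then I think you should 
probably contact the author to clarify the situation.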

> I understand that I could first download the supplementary files 
> manually from the GEO website and then read them into R. 
> Unfortunately, I am doing a meta-analysis of cancer microarrays, so I 
> would have to download 20+ datasets manually to get the raw data. 
> So I wonder: when the raw data are available in GEO, is there any way 
> to parse them directly into R? (Some of those records have both 
> processed and raw data, but once parsed with getGEO, only the 
> processed data are shown.)

The link for the supplementary files is embedded in the GSE header 
information, if available.  You can certainly use R to download those 
files and uncompress them.  You will still need to make some decisions 
about how you would like to treat these raw data after they are 
downloaded.  Since you are setting up to do a meta-analysis, presumably 
you have thought a good deal about how to go about processing the raw 
data and analyzing the results across datasets.
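
As a rough sketch of the download step (not a tested recipe): the code 
below assumes the parsed header exposes a "supplementary_file" field via 
GEOquery's Meta(), and that the field may be absent or set to "NONE". 
The helper names supp_urls and fetch_supp and the "geo_raw" directory 
are made up for illustration.

```r
# Return the supplementary-file URLs recorded in a parsed GEO header.
# `meta` is a named list such as the one returned by GEOquery's Meta().
supp_urls <- function(meta) {
  urls <- unlist(meta[["supplementary_file"]], use.names = FALSE)
  if (is.null(urls)) return(character(0))
  urls <- as.character(urls)
  # Drop empty entries and the literal "NONE" placeholder.
  urls[!is.na(urls) & nzchar(urls) & toupper(urls) != "NONE"]
}

# Download every supplementary file listed for one accession.
# Hypothetical helper: needs network access and the GEOquery package.
fetch_supp <- function(accession, dest_dir = "geo_raw") {
  if (!requireNamespace("GEOquery", quietly = TRUE)) {
    stop("GEOquery is required")
  }
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  rec  <- GEOquery::getGEO(accession, GSEMatrix = FALSE)
  urls <- supp_urls(GEOquery::Meta(rec))
  dest <- file.path(dest_dir, basename(urls))
  for (i in seq_along(urls)) {
    download.file(urls[i], destfile = dest[i], mode = "wb")
  }
  invisible(dest)  # paths of the downloaded files
}

# Usage (not run); the accession vector is a placeholder:
# files <- lapply(my_accessions, fetch_supp)
```

Tar archives can then be unpacked with untar() before the .CEL files 
are read.  Newer GEOquery releases also provide getGEOSuppFiles(), 
which may do the same job if your installed version has it.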

Sean


