[BioC] GEOquery on rawdata and processed data ?
Sean Davis
sdavis2 at mail.nih.gov
Tue Jul 3 22:29:10 CEST 2007
Alex Tsoi wrote:
> Thanks all of you for the information.
>
> However, as I mentioned in my previous emails, some GEO samples (e.g.
> GSM72287) have both a .CEL file and an .EXP file. I looked up the
> paper: http://www.ncbi.nlm.nih.gov/sites/entrez
> and the authors say that they deposited the processed data as .CEL
> and the raw data as .EXP.
The .CEL files are, by definition, raw files. If a manuscript says
otherwise, then I think you should probably contact the author to
clarify the situation.
> I understand that I could first download the supplementary files
> manually from the GEO website and then read them into R. But
> unfortunately, I am doing a meta-analysis on cancer microarrays, so I
> would have to download 20+ datasets manually to get the raw data.
> So I just wonder, in case the raw data are available in GEO, is
> there any way I could parse them directly into R? (Some of those
> datasets have both processed and raw data, but once parsed using
> getGEO, only the processed data are shown.)
The link for the supplementary files is embedded in the GSE header
information, if available. You can certainly use R to download those
files and uncompress them. You will still need to make some decisions
about how you would like to treat these raw data after they are
downloaded. Since you are setting up to do a meta-analysis, presumably
you have thought a good deal about how to go about processing the raw
data and analyzing the results across datasets.
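
As a rough sketch of the download step, something along these lines might
work. It assumes that the metadata list returned by GEOquery's Meta() carries
the links in an entry named supplementary_file. The helper names supp_urls
and fetch_supp, the geo_raw directory, and the accessions vector are made up
for illustration and are not part of GEOquery. Recent GEOquery releases also
provide getGEOSuppFiles(), which may cover the same ground.

```r
# Pull supplementary-file URLs out of a GEO metadata list, such as the one
# returned by GEOquery's Meta(). Entries of "NONE" (no file deposited) and
# empty or missing entries are dropped.
supp_urls <- function(meta) {
  urls <- as.character(unlist(meta[["supplementary_file"]], use.names = FALSE))
  urls[!is.na(urls) & nzchar(urls) & toupper(urls) != "NONE"]
}

# Download each URL into destdir and unpack any .tar archives in place.
# Returns (invisibly) the paths of all files now present in destdir.
fetch_supp <- function(urls, destdir = "geo_raw") {
  dir.create(destdir, showWarnings = FALSE, recursive = TRUE)
  for (u in urls) {
    dest <- file.path(destdir, basename(u))
    download.file(u, dest, mode = "wb")  # binary mode keeps archives intact
    if (grepl("\\.tar$", dest)) untar(dest, exdir = destdir)
  }
  invisible(list.files(destdir, full.names = TRUE))
}

# Usage (needs network access and GEOquery; 'accessions' is your own
# character vector of GSE identifiers):
# library(GEOquery)
# for (acc in accessions) {
#   gse <- getGEO(acc, GSEMatrix = FALSE)
#   fetch_supp(supp_urls(Meta(gse)), destdir = file.path("geo_raw", acc))
# }
```

Series without deposited supplementary files simply yield an empty vector, so
the loop skips them without downloading anything.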
Sean