[BioC] queryGEO fails on GDS files (GEO Datasets)
Sean Davis
sdavis2 at mail.nih.gov
Wed Jan 4 17:33:17 CET 2006
On 1/4/06 10:50 AM, "Sean Davis" <sdavis2 at mail.nih.gov> wrote:
> Peter,
>
> I have recently uploaded a new package to Bioconductor called GEOquery. It
> is available as a development package
> (http://www.bioconductor.org/packages/bioc/1.8/html/GEOquery.html), but it
> doesn't depend on much, so it should work with recent R and Bioconductor
> releases. It is capable of downloading and parsing GDS, GSM, GPL, and GSE.
> (GSE download and parsing seems to be broken on Windows, at least for some
> GSEs--working on that). After installing, you could do:
>
>> library(GEOquery)
> # the following takes about a minute or so....
>> gds813 <- getGEO('GDS813')
>
> And then to convert to an exprSet, simply do:
>
>> eset <- GDS2eSet(GDS,do.log2=TRUE)
I made a typo in the line above. It should read:
eset <- GDS2eSet(gds813,do.log2=TRUE)
This makes an exprSet that includes the sample information from the GDS that
was downloaded and parsed using getGEO above.
>> eset
> Expression Set (exprSet) with
> 22690 genes
> 20 samples
> phenoData object with 4 variables and 38 cases
> varLabels
> : sample
> : disease.state
> : tissue
> : description
>
> Sean
>
>
> On 1/4/06 10:27 AM, "Peter" <bioconductor-mailinglist at maubp.freeserve.co.uk>
> wrote:
>
>
>> Would it make more sense to provide two separate functions:
>>
>> Firstly, to download the file (dealing with all possible URLs) and if
>> need be decompress it.
See the function "getGEOfile" in the GEOquery package.
>> Secondly, to parse a GEO file from the provided handle/filename/url
>>
>> This makes sense for other large GEO files like the GPL annotation
>> files, as well as the GEO datasets (GDS files). It seems wasteful and
>> slow to download them fresh each time.
The getGEO function also includes a filename argument. The file given by
the filename will be parsed as a GEO file; .gz files are handled
appropriately as long as the file extension '.gz' is present.
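
Putting the two pieces together, a download-once, parse-many workflow could
look roughly like the sketch below. The helper names cached_geo_path and
get_geo_cached and the "geo_cache" directory are made up for illustration;
they are not part of GEOquery. The sketch also assumes that getGEOfile
accepts a destdir argument and returns the path of the file it saved, so
check ?getGEOfile for your installed version before relying on it.

```r
# Hypothetical helper: return the path of a previously downloaded file for
# this accession (e.g. "GDS813.soft.gz"), or NA if none is in the cache.
cached_geo_path <- function(accession, cache_dir) {
  hits <- list.files(cache_dir,
                     pattern = paste0("^", accession, "\\."),
                     full.names = TRUE)
  if (length(hits) > 0) hits[1] else NA_character_
}

# Hypothetical helper: download only when no local copy exists, then parse
# the local file with getGEO's filename argument.
get_geo_cached <- function(accession, cache_dir = "geo_cache") {
  if (!dir.exists(cache_dir)) dir.create(cache_dir, recursive = TRUE)
  path <- cached_geo_path(accession, cache_dir)
  if (is.na(path)) {
    # Assumption: getGEOfile(accession, destdir = ...) returns the saved path
    path <- GEOquery::getGEOfile(accession, destdir = cache_dir)
  }
  GEOquery::getGEO(filename = path)
}

# Usage (needs GEOquery installed and, on the first call, network access):
# gds813 <- get_geo_cached("GDS813")
```

The first call downloads the file into the cache directory; later calls for
the same accession parse the local copy and skip the download.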
Sean