[BioC] GEOquery::parseGEO throws error reading file
Gad Abraham
gabraham at csse.unimelb.edu.au
Sat Aug 2 05:28:06 CEST 2008
Sean Davis wrote:
> On Sun, Jul 27, 2008 at 11:44 PM, Gad Abraham
> <gabraham at csse.unimelb.edu.au> wrote:
>> Hi,
>>
>> I'm using GEOquery 2.4.1 to read an NCBI GEO file
>> ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE4284/GSE4284_series_matrix.txt.gz
>> but parseGEO throws an error. The switch argument evaluates to "0", which
>> doesn't match any alternative, so it falls through to the last, empty
>> argument and fails. I don't know if this is related to the warnings; the file
>> contains text such as manufacturer\xa1\xafs which may not parse correctly.
>>
>> Below is the output.
>>
>> Thanks for any advice,
>> Gad
>>
>>> g <- getGEO(filename="GSE4284_series_matrix.txt")
>
> Hi, Gad. The filename argument does not yet take GSE series matrix
> files as an argument. I have a couple of changes to make with GSE
> series matrix handling and adding file-based parsing is one of them.
>
> Sean
Hi Sean,
Is this also the reason for scan() failing when using the GEO name
instead of the filename? (See below)
Thanks,
Gad
> g <- getGEO("GSE4284")
Found 1 file(s)
GSE4284_series_matrix.txt.gz
trying URL
'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE4284/GSE4284_series_matrix.txt.gz'
ftp data connection made, file length 4137302 bytes
opened URL
==================================================
downloaded 3.9 Mb
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,
na.strings, :
scan() expected 'a real', got '"Schizosaccharomycespombe"'
In addition: Warning message:
In grep("^!Sample_", a, ignore.case = TRUE) :
input string 1 is invalid in this locale
> sessionInfo()
R version 2.7.1 (2008-06-23)
x86_64-pc-linux-gnu
locale:
LC_CTYPE=en_AU.UTF-8;LC_NUMERIC=C;LC_TIME=en_AU.UTF-8;LC_COLLATE=en_AU.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_AU.UTF-8;LC_PAPER=en_AU.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_AU.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] tools stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] GEOquery_2.4.1 RCurl_0.9-4 Biobase_2.0.1
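
The locale warning above is consistent with the non-UTF-8 bytes in the file header being matched under a UTF-8 locale. A minimal sketch of two possible ways to handle such lines in plain R follows. It has not been tested against GEOquery itself, it says nothing about the cause of the scan() error, the sample lines are made up for illustration, and treating the bytes as Latin-1 is an assumption rather than a known fact about this file. Note that validUTF8() requires a much newer R than 2.7.1.

```r
# Hypothetical header lines standing in for the series matrix file;
# the second one carries the raw bytes 0xA1 0xAF mentioned earlier.
lines <- c(
  "!Sample_title\t\"sample one\"",
  "!Sample_description\t\"manufacturer\xa1\xafs protocol\"",
  "\"ID_REF\"\t\"GSM1\""
)

# Flag lines that are not valid UTF-8 (validUTF8() needs R >= 3.3.0).
bad <- !validUTF8(lines)

# Option 1: match byte-wise, so the locale's encoding is not consulted.
is_sample <- grepl("^!Sample_", lines, useBytes = TRUE)

# Option 2: re-encode before matching. Reading the bytes as Latin-1 is an
# assumption; it yields valid UTF-8 but not necessarily the intended glyphs.
clean <- iconv(lines, from = "latin1", to = "UTF-8")

print(bad)
print(is_sample)
```

Option 1 leaves the data untouched and only changes how the pattern is matched. Option 2 rewrites the strings, so it is only appropriate if the displayed characters in the header do not matter.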
--
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham