[BioC] Importing data from GEOquery

Sean Davis sdavis2 at mail.nih.gov
Wed Jun 25 22:16:38 CEST 2008


On Wed, Jun 25, 2008 at 3:57 PM, Vincent Carey 525-2265
<stvjc at channing.harvard.edu> wrote:
> please read the posting guide before posting.  you should
> identify your system and version of R.
>
> i can verify that this problem occurs on windows R 2.7.0
> and on mac osx R 2.8.0 r45836 with RCurl 0.8-3 and GEOquery 2.5.0
>
> however example(getGEO) works on those systems, so i wonder
> if the problem is with the GSM3612 files rather than GEOquery.
> we have no way of verifying that the files on GEO are parseable
> short of trying to read them.
>
> ---
> Vince Carey, PhD
> Assoc. Prof Med (Biostatistics)
> Harvard Medical School
> Channing Laboratory - ph 6175252265 fa 6177311541
> 181 Longwood Ave Boston MA 02115 USA
> stvjc at channing.harvard.edu
>
> On Wed, 25 Jun 2008, Kini, Aditya M wrote:
>
>> Hi,
>>
>> I am repeatedly getting this error message when I try to import a file from GEO. Here is the code:
>>
>> > gsm.1 <- getGEO("GSM3612")
>> trying URL 'http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?targ=self&acc=GSM3612&form=text&view=full'
>> Content type 'geo/text' length unknown
>> opened URL
>> downloaded 1.5 Mb
>>
>> File stored at:
>> C:\Users\Aditya\AppData\Local\Temp\Rtmp5ZePB5/GSM3612.soft
>> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>>   scan() expected 'a real', got '!sample_table_end'
>> > gsm.1
>> Error: object "gsm.1" not found
>>
>> Please let me know what the problem is.

Thanks, Vince, for checking into the problem.  It looks like those
files contain a multibyte character (apparently meant to be a degree
symbol) that causes trouble in at least some locales.  I don't know of
an easy fix, since the files at NCBI are all supposed to be in the
same character encoding (UTF-8).  If anyone knows of a solution, let
me know.
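
One possible workaround, offered only as a sketch and not tested
against GSM3612 itself: strip the non-ASCII bytes from the downloaded
SOFT file before parsing it.  The helper name sanitize_soft and the
two-line sample file below are made up for illustration, and the final
commented-out call assumes that your GEOquery version's getGEO()
accepts a `filename` argument for local files.

```r
# Hypothetical helper, not part of GEOquery: copy a SOFT file, replacing
# every byte that is not plain ASCII with "?".
sanitize_soft <- function(infile, outfile = tempfile(fileext = ".soft")) {
  # Read the lines and mark them as UTF-8 (the encoding NCBI is said to use)
  lines <- readLines(infile, warn = FALSE, encoding = "UTF-8")
  # Convert to ASCII; bytes that cannot be converted are replaced by `sub`
  clean <- iconv(lines, from = "UTF-8", to = "ASCII", sub = "?")
  writeLines(clean, outfile)
  outfile
}

# Small self-contained demonstration on a made-up two-line file
raw_file <- tempfile(fileext = ".soft")
writeLines(enc2utf8(c("!Sample_description = grown at 37\u00b0C",
                      "!sample_table_end")),
           raw_file, useBytes = TRUE)
cleaned <- sanitize_soft(raw_file)
print(readLines(cleaned))

# With GEOquery installed, the cleaned file could then be parsed locally
# (assumes getGEO() accepts a `filename` argument in your version):
# library(GEOquery)
# gsm.1 <- getGEO(filename = sanitize_soft("path/to/GSM3612.soft"))
```

This loses the original character (the degree symbol becomes one or
more "?"), which should be harmless if it only appears in free-text
description fields, but that is an assumption worth checking per file.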

Sean


