[Bioc-devel] Windows-only issue with downloading a Rdata file and loading it with R
Leonardo Collado Torres
lcollado at jhu.edu
Sun Jun 19 16:38:28 CEST 2016
Thanks Martin! Using mode='wb' solved the issue.
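
For anyone who finds this thread later, here is a minimal sketch of the fix,
assuming the cause is the one Martin points to: the default mode = "w" of
utils::download.file() writes in text mode, which on Windows can alter the
bytes of a binary file. The helper name download_and_load() and its arguments
are hypothetical and are not part of downloader or of my package.

```r
# Hypothetical helper: download an .Rdata file in binary mode, then load it
# into its own environment so nothing in the workspace gets overwritten.
download_and_load <- function(url,
                              destfile = tempfile(fileext = ".Rdata"),
                              envir = new.env()) {
  # mode = "wb" writes the downloaded bytes unchanged.
  status <- utils::download.file(url, destfile = destfile,
                                 mode = "wb", quiet = TRUE)
  # download.file() returns 0 on success.
  if (status != 0) stop("download failed with status ", status)

  # load() returns the names of the restored objects.
  loaded <- load(destfile, envir = envir)
  list(path = destfile, objects = loaded, envir = envir)
}
```

Since downloader::download() forwards extra arguments to download.file(), the
call from my earlier message should also work as
download(url, destfile = 'test.Rdata', mode = 'wb').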
On Sat, Jun 18, 2016 at 10:41 AM, Martin Morgan
<martin.morgan at roswellpark.org> wrote:
> On 06/18/2016 12:58 AM, Leonardo Collado Torres wrote:
>>
>> Hi,
>>
>> I get the same error when hosting the data somewhere else or when using
>> RawGit's URL. That is:
>>
>>> library('downloader')
>>> download('http://www.biostat.jhsph.edu/~lcollado/recount/metadata_clean_sra.Rdata',
>>     destfile = 'test2.Rdata')
>>> load('test2.Rdata')
>>
>> Error: ReadItem: unknown type 50, perhaps written by later version of R
>>>
>>> download('https://cdn.rawgit.com/leekgroup/recount-website/master/metadata/metadata_clean_sra.Rdata',
>>     destfile = 'test3.Rdata')
>>> load('test3.Rdata')
>>
>> Error: ReadItem: unknown type 50, perhaps written by later version of R
>>
>> Again, it only happens on Windows and not on other operating systems. So it
>> doesn't look like a GitHub issue.
>
>
> Use mode="wb" to download in binary mode.
>
> Martin
>
>>
>> Best,
>> Leo
>>
>>
>> On Fri, Jun 17, 2016 at 4:57 PM, Gabe Becker <becker.gabe at gene.com> wrote:
>>
>>> I wonder if raw only means "raw after line return munging"? Can you attach
>>> the file that gets downloaded via email? (off list is fine)
>>>
>>> On Fri, Jun 17, 2016 at 1:44 PM, Leonardo Collado Torres
>>> <lcollado at jhu.edu> wrote:
>>>
>>>
>>>> Hi,
>>>>
>>>> I'm trying to figure out what is going wrong with an error that pops
>>>> up on Windows only. It's currently the only error for a package that I
>>>> recently submitted to Bioc. The function is fairly simple: it
>>>> downloads an Rdata file from the web and loads it.
>>>>
>>>> If I try to download and load the file with R, the following error
>>>> occurs (only on Windows):
>>>>
>>>>
>>>>> library('downloader')
>>>>> download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true',
>>>>     destfile = 'test.Rdata')
>>>> trying URL 'https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true'
>>>> Content type 'application/octet-stream' length 2531337 bytes (2.4 MB)
>>>> downloaded 2.4 MB
>>>>
>>>>> load('test.Rdata')
>>>>
>>>> Error: ReadItem: unknown type 50, perhaps written by later version of R
>>>>>
>>>>> traceback()
>>>>
>>>> 1: load("test.Rdata")
>>>>>
>>>>> options(width = 120)
>>>>> devtools::session_info()
>>>>
>>>> Session info -------------------------------------------------------
>>>>  setting  value
>>>>  version  R version 3.3.0 (2016-05-03)
>>>>  system   x86_64, mingw32
>>>>  ui       Rgui
>>>>  language (EN)
>>>>  collate  English_United States.1252
>>>>  tz       America/New_York
>>>>  date     2016-06-17
>>>>
>>>> Packages -----------------------------------------------------------
>>>>  package    * version date       source
>>>>  devtools     1.11.1  2016-04-21 CRAN (R 3.3.0)
>>>>  digest       0.6.9   2016-01-08 CRAN (R 3.3.0)
>>>>  downloader * 0.4     2015-07-09 CRAN (R 3.3.0)
>>>>  memoise      1.0.0   2016-01-29 CRAN (R 3.3.0)
>>>>  withr        1.0.1   2016-02-04 CRAN (R 3.3.0)
>>>>>
>>>>>
>>>>
>>>>
>>>> If I open the same URL in my browser and manually download the file,
>>>> then everything works as shown below:
>>>>
>>>>> load('metadata_clean_sra.Rdata')
>>>>> metadata_clean
>>>>
>>>> Loading required package: S4Vectors
>>>> Loading required package: stats4
>>>> Loading required package: BiocGenerics
>>>> Loading required package: parallel
>>>> ## removed more output
>>>>
>>>>> options(width = 120)
>>>>> devtools::session_info()
>>>>
>>>> Session info -------------------------------------------------------
>>>>  setting  value
>>>>  version  R version 3.3.0 (2016-05-03)
>>>>  system   x86_64, mingw32
>>>>  ui       Rgui
>>>>  language (EN)
>>>>  collate  English_United States.1252
>>>>  tz       America/New_York
>>>>  date     2016-06-17
>>>>
>>>> Packages -----------------------------------------------------------
>>>>  package      * version date       source
>>>>  BiocGenerics * 0.19.1  2016-06-17 Bioconductor
>>>>  devtools       1.11.1  2016-04-21 CRAN (R 3.3.0)
>>>>  digest         0.6.9   2016-01-08 CRAN (R 3.3.0)
>>>>  IRanges      * 2.7.2   2016-06-07 Bioconductor
>>>>  memoise        1.0.0   2016-01-29 CRAN (R 3.3.0)
>>>>  S4Vectors    * 0.11.3  2016-06-03 Bioconductor
>>>>  withr          1.0.1   2016-02-04 CRAN (R 3.3.0)
>>>>>
>>>>> print(object.size(metadata_clean), units = 'Mb')
>>>>
>>>> 30.5 Mb
>>>>
>>>> The object itself is a DataFrame and was created using R 3.3.1 with
>>>> S4Vectors version 0.11.4. I get the same error if I re-save the data
>>>> on a Unix machine using R 3.3.0 (with S4Vectors from Bioc-release).
>>>>
>>>> Some Google leads point to a "corrupt file" or to something about a
>>>> hidden session Rdata file. But from the manual test, everything looks
>>>> fine, unless downloader::download() (or alternatively
>>>> utils::download.file()) is corrupting the file.
>>>>
>>>>
>>>> An option would be to include the data in the package, but I'd like to
>>>> avoid doing so to minimize the package size. It already has a big
>>>> data.frame that is necessary for the package to work. This short
>>>> function is there for convenience.
>>>>
>>>> Best,
>>>> Leo
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>>
>>>
>>>
>>> --
>>> Gabriel Becker, Ph.D
>>> Associate Scientist
>>> Bioinformatics and Computational Biology
>>> Genentech Research
>>>
>>
>>
>
>