[Bioc-devel] Windows-only issue with downloading a Rdata file and loading it with R
Martin Morgan
martin.morgan at roswellpark.org
Sat Jun 18 16:41:40 CEST 2016
On 06/18/2016 12:58 AM, Leonardo Collado Torres wrote:
> Hi,
>
> I get the same error while hosting the data somewhere else or when using
> RawGit's url. That is:
>
>> library('downloader')
>> download('http://www.biostat.jhsph.edu/~lcollado/recount/metadata_clean_sra.Rdata',
>     destfile = 'test2.Rdata')
>> load('test2.Rdata')
> Error: ReadItem: unknown type 50, perhaps written by later version of R
>> download('https://cdn.rawgit.com/leekgroup/recount-website/master/metadata/metadata_clean_sra.Rdata',
>     destfile = 'test3.Rdata')
>> load('test3.Rdata')
> Error: ReadItem: unknown type 50, perhaps written by later version of R
>
> Again, it only happens on Windows but not on the other OS. So it doesn't
> look like a GitHub issue.
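Use mode = "wb" to download in binary mode. On Windows the default text
mode ("w") can alter the bytes of a binary file such as an .Rdata file.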
Martin
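
A minimal sketch of the suggestion above, assuming the download goes through
utils::download.file(). The helper name fetch_rdata() and the small example
data frame are hypothetical, and the round trip uses a local file:// URL only
so that it runs without network access:

```r
# Hypothetical helper: download an .Rdata file in binary mode and load it
# into a fresh environment, returning the loaded objects as a named list.
fetch_rdata <- function(url, destfile = tempfile(fileext = ".Rdata")) {
    # mode = "wb" writes the bytes unchanged. The default mode = "w" is
    # text mode, which on Windows can alter bytes in a binary file.
    utils::download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
    env <- new.env()
    loaded <- load(destfile, envir = env)
    mget(loaded, envir = env)
}

# Offline round trip: save a small object, then fetch it via a file:// URL.
src <- tempfile(fileext = ".Rdata")
x <- data.frame(id = 1:3, value = c(0.5, 1.5, 2.5))
save(x, file = src)
src_url <- paste0("file://", normalizePath(src, winslash = "/"))
res <- fetch_rdata(src_url)

# TRUE if the object survived the download unchanged.
isTRUE(all.equal(res$x, x))
```

downloader::download() forwards extra arguments to download.file(), so adding
mode = "wb" to the download() calls quoted below should have the same effect.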
>
> Best,
> Leo
>
>
> On Fri, Jun 17, 2016 at 4:57 PM, Gabe Becker <becker.gabe at gene.com> wrote:
>
>> I wonder if raw only means "raw after line return munging"? can you attach
>> the file that gets downloaded via email? (off list is fine)
>>
>> On Fri, Jun 17, 2016 at 1:44 PM, Leonardo Collado Torres <lcollado at jhu.edu> wrote:
>>
>>> Hi,
>>>
>>> I'm trying to figure out what is going wrong with an error that pops
>>> up on Windows only. It's currently the only error for a package that I
>>> recently submitted to Bioc. The function is fairly simple: it
>>> downloads a Rdata file from the web and loads it.
>>>
>>> If I try to download and load the file with R, the following error
>>> occurs (only on Windows):
>>>
>>>
>>>> library('downloader')
>>>> download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true',
>>>     destfile = 'test.Rdata')
>>> trying URL 'https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true'
>>> Content type 'application/octet-stream' length 2531337 bytes (2.4 MB)
>>> downloaded 2.4 MB
>>>
>>>> load('test.Rdata')
>>> Error: ReadItem: unknown type 50, perhaps written by later version of R
>>>> traceback()
>>> 1: load("test.Rdata")
>>>> options(width = 120)
>>>> devtools::session_info()
>>> Session info
>>> -----------------------------------------------------------------------------------------------------------
>>> setting value
>>> version R version 3.3.0 (2016-05-03)
>>> system x86_64, mingw32
>>> ui Rgui
>>> language (EN)
>>> collate English_United States.1252
>>> tz America/New_York
>>> date 2016-06-17
>>>
>>> Packages
>>> ---------------------------------------------------------------------------------------------------------------
>>> package * version date source
>>> devtools 1.11.1 2016-04-21 CRAN (R 3.3.0)
>>> digest 0.6.9 2016-01-08 CRAN (R 3.3.0)
>>> downloader * 0.4 2015-07-09 CRAN (R 3.3.0)
>>> memoise 1.0.0 2016-01-29 CRAN (R 3.3.0)
>>> withr 1.0.1 2016-02-04 CRAN (R 3.3.0)
>>>>
>>>
>>>
>>> If I open the same url on my browser and manually download the file,
>>> then everything works as shown below:
>>>
>>>> load('metadata_clean_sra.Rdata')
>>>> metadata_clean
>>> Loading required package: S4Vectors
>>> Loading required package: stats4
>>> Loading required package: BiocGenerics
>>> Loading required package: parallel
>>> ## removed more output
>>>
>>>> options(width = 120)
>>>> devtools::session_info()
>>> Session info
>>> -----------------------------------------------------------------------------------------------------------
>>> setting value
>>> version R version 3.3.0 (2016-05-03)
>>> system x86_64, mingw32
>>> ui Rgui
>>> language (EN)
>>> collate English_United States.1252
>>> tz America/New_York
>>> date 2016-06-17
>>>
>>> Packages
>>> ---------------------------------------------------------------------------------------------------------------
>>> package * version date source
>>> BiocGenerics * 0.19.1 2016-06-17 Bioconductor
>>> devtools 1.11.1 2016-04-21 CRAN (R 3.3.0)
>>> digest 0.6.9 2016-01-08 CRAN (R 3.3.0)
>>> IRanges * 2.7.2 2016-06-07 Bioconductor
>>> memoise 1.0.0 2016-01-29 CRAN (R 3.3.0)
>>> S4Vectors * 0.11.3 2016-06-03 Bioconductor
>>> withr 1.0.1 2016-02-04 CRAN (R 3.3.0)
>>>> print(object.size(metadata_clean), units = 'Mb')
>>> 30.5 Mb
>>>
>>> The object itself is a DataFrame and was created using R 3.3.1 with
>>> S4Vectors version 0.11.4. I get the same error if I re-save the data
>>> on a Unix machine using R 3.3.0 (with S4Vectors from Bioc-release).
>>>
>>> Some Google leads point to a "corrupt file" or something about a hidden
>>> session Rdata file. But from the manual test, everything looks fine,
>>> unless downloader::download() (or alternatively utils::download.file())
>>> is corrupting the file.
>>>
>>>
>>> An option would be to include the data in the package, but I'd like to
>>> avoid doing so to minimize the package size. It already has a big
>>> data.frame that is necessary for the package to work. This short
>>> function is there for convenience.
>>>
>>> Best,
>>> Leo
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>>
>>
>>
>> --
>> Gabriel Becker, Ph.D
>> Associate Scientist
>> Bioinformatics and Computational Biology
>> Genentech Research
>>
>
>