[Bioc-devel] Windows-only issue with downloading a Rdata file and loading it with R
Leonardo Collado Torres
lcollado at jhu.edu
Sat Jun 18 06:58:40 CEST 2016
Hi,
I get the same error while hosting the data somewhere else or when using
RawGit's url. That is:
> library('downloader')
> download('http://www.biostat.jhsph.edu/~lcollado/recount/metadata_clean_sra.Rdata',
destfile = 'test2.Rdata')
> load('test2.Rdata')
Error: ReadItem: unknown type 50, perhaps written by later version of R
> download('https://cdn.rawgit.com/leekgroup/recount-website/master/metadata/metadata_clean_sra.Rdata',
destfile = 'test3.Rdata')
> load('test3.Rdata')
Error: ReadItem: unknown type 50, perhaps written by later version of R
Again, it only happens on Windows but not on the other OS. So it doesn't
look like a GitHub issue.
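
One thing that may be worth checking, although nothing in this thread confirms it: on Windows, utils::download.file() writes in text mode (mode = "w") by default, which can alter the bytes of a binary file, and passing mode = "wb" avoids that. The sketch below assumes this is the cause. download_binary() and same_file() are hypothetical helper names, not part of downloader or utils.

```r
# Hypothetical helper: download a file and write its bytes unchanged.
# Assumption: the Windows-only failure comes from text-mode writing.
download_binary <- function(url, destfile) {
    # mode = "wb" requests binary mode, so no newline translation happens
    status <- utils::download.file(url, destfile = destfile,
                                   mode = "wb", quiet = TRUE)
    if (status != 0) stop("download failed with status ", status)
    invisible(destfile)
}

# Hypothetical helper: TRUE when two files have the same MD5 checksum.
same_file <- function(path_a, path_b) {
    sums <- tools::md5sum(c(path_a, path_b))
    unname(sums[1] == sums[2])
}

# Usage with a URL from this thread; the second file is the copy that was
# downloaded manually with the browser:
# download_binary('http://www.biostat.jhsph.edu/~lcollado/recount/metadata_clean_sra.Rdata',
#                 'test_wb.Rdata')
# same_file('test_wb.Rdata', 'metadata_clean_sra.Rdata')
# load('test_wb.Rdata')
```

If the checksums match and load() succeeds, that would point at the write mode rather than at the hosting service.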
Best,
Leo
On Fri, Jun 17, 2016 at 4:57 PM, Gabe Becker <becker.gabe at gene.com> wrote:
> I wonder if raw only means "raw after line return munging"? can you attach
> the file that gets downloaded via email? (off list is fine)
>
> On Fri, Jun 17, 2016 at 1:44 PM, Leonardo Collado Torres <lcollado at jhu.edu> wrote:
>
>> Hi,
>>
>> I'm trying to figure out what is going wrong with an error that pops
>> up on Windows only. It's currently the only error for a package that I
>> recently submitted to Bioc. The function is fairly simple: it
>> downloads a Rdata file from the web and loads it.
>>
>> If I try to download and load the file with R, the following error
>> occurs (only on Windows):
>>
>>
>> > library('downloader')
>> > download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true',
>> destfile = 'test.Rdata')
>> trying URL 'https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true'
>> Content type 'application/octet-stream' length 2531337 bytes (2.4 MB)
>> downloaded 2.4 MB
>>
>> > load('test.Rdata')
>> Error: ReadItem: unknown type 50, perhaps written by later version of R
>> > traceback()
>> 1: load("test.Rdata")
>> > options(width = 120)
>> > devtools::session_info()
>> Session info
>> -----------------------------------------------------------------------------------------------------------
>> setting value
>> version R version 3.3.0 (2016-05-03)
>> system x86_64, mingw32
>> ui Rgui
>> language (EN)
>> collate English_United States.1252
>> tz America/New_York
>> date 2016-06-17
>>
>> Packages
>> ---------------------------------------------------------------------------------------------------------------
>> package * version date source
>> devtools 1.11.1 2016-04-21 CRAN (R 3.3.0)
>> digest 0.6.9 2016-01-08 CRAN (R 3.3.0)
>> downloader * 0.4 2015-07-09 CRAN (R 3.3.0)
>> memoise 1.0.0 2016-01-29 CRAN (R 3.3.0)
>> withr 1.0.1 2016-02-04 CRAN (R 3.3.0)
>> >
>>
>>
>> If I open the same url on my browser and manually download the file,
>> then everything works as shown below:
>>
>> > load('metadata_clean_sra.Rdata')
>> > metadata_clean
>> Loading required package: S4Vectors
>> Loading required package: stats4
>> Loading required package: BiocGenerics
>> Loading required package: parallel
>> ## removed more output
>>
>> > options(width = 120)
>> > devtools::session_info()
>> Session info
>> -----------------------------------------------------------------------------------------------------------
>> setting value
>> version R version 3.3.0 (2016-05-03)
>> system x86_64, mingw32
>> ui Rgui
>> language (EN)
>> collate English_United States.1252
>> tz America/New_York
>> date 2016-06-17
>>
>> Packages
>> ---------------------------------------------------------------------------------------------------------------
>> package * version date source
>> BiocGenerics * 0.19.1 2016-06-17 Bioconductor
>> devtools 1.11.1 2016-04-21 CRAN (R 3.3.0)
>> digest 0.6.9 2016-01-08 CRAN (R 3.3.0)
>> IRanges * 2.7.2 2016-06-07 Bioconductor
>> memoise 1.0.0 2016-01-29 CRAN (R 3.3.0)
>> S4Vectors * 0.11.3 2016-06-03 Bioconductor
>> withr 1.0.1 2016-02-04 CRAN (R 3.3.0)
>> > print(object.size(metadata_clean), units = 'Mb')
>> 30.5 Mb
>>
>> The object itself is a DataFrame and was created using R 3.3.1 with
>> S4Vectors version 0.11.4. I get the same error if I re-save the data
>> on a Unix machine using R 3.3.0 (with S4Vectors from Bioc-release).
>>
>> Some Google leads point to a "corrupt file" or to something about a
>> hidden session Rdata file. But from the manual test, everything looks
>> fine, unless downloader::download() (or alternatively
>> utils::download.file()) is corrupting the file.
>>
>>
>> An option would be to include the data in the package, but I'd like to
>> avoid doing so to minimize the package size. It already has a big
>> data.frame that is necessary for the package to work. This short
>> function is there for convenience.
>>
>> Best,
>> Leo
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>
>
> --
> Gabriel Becker, Ph.D
> Associate Scientist
> Bioinformatics and Computational Biology
> Genentech Research
>