[Bioc-devel] Windows-only issue with downloading a Rdata file and loading it with R

Leonardo Collado Torres lcollado at jhu.edu
Sat Jun 18 06:58:40 CEST 2016


Hi,

I get the same error while hosting the data somewhere else or when using
RawGit's url. That is:

> library('downloader')
> download('http://www.biostat.jhsph.edu/~lcollado/recount/metadata_clean_sra.Rdata',
destfile = 'test2.Rdata')
> load('test2.Rdata')
Error: ReadItem: unknown type 50, perhaps written by later version of R
> download('https://cdn.rawgit.com/leekgroup/recount-website/master/metadata/metadata_clean_sra.Rdata',
destfile = 'test3.Rdata')
> load('test3.Rdata')
Error: ReadItem: unknown type 50, perhaps written by later version of R

Again, it only happens on Windows and not on the other operating systems,
so it doesn't look like a GitHub issue.
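
One way to test the line-ending idea would be to force a binary-mode download and compare the result against the copy saved manually from the browser. The sketch below is only an illustration: download_binary() and same_contents() are hypothetical helper names, and it assumes the cause is that utils::download.file() writes in text mode (mode = "w") by default on Windows. I believe downloader::download() passes extra arguments such as mode on to download.file(), but that is an assumption worth checking too.

```r
# Hypothetical helper (not part of any package): download a binary file,
# such as an .Rdata file, without text-mode newline translation.
download_binary <- function(url, destfile) {
    # mode = "wb" asks download.file() to write the bytes unchanged.
    # Assumption: on Windows the default mode = "w" may alter
    # line-ending bytes and so corrupt binary content.
    utils::download.file(url, destfile = destfile, mode = "wb")
    invisible(destfile)
}

# Hypothetical helper: TRUE if two files have identical MD5 checksums.
same_contents <- function(file_a, file_b) {
    unname(tools::md5sum(file_a)) == unname(tools::md5sum(file_b))
}

# Usage (left commented out because it needs network access and the
# manually downloaded copy of the file):
# url <- "http://www.biostat.jhsph.edu/~lcollado/recount/metadata_clean_sra.Rdata"
# download_binary(url, "test_wb.Rdata")
# file.size("test_wb.Rdata")   # the server reported 2531337 bytes
# same_contents("test_wb.Rdata", "metadata_clean_sra.Rdata")
# load("test_wb.Rdata")
```

If the checksums match only when mode = "wb" is used, that would point to the download step rather than to the file or to GitHub.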

Best,
Leo


On Fri, Jun 17, 2016 at 4:57 PM, Gabe Becker <becker.gabe at gene.com> wrote:

> I wonder if raw only means "raw after line return munging"? Can you attach
> the file that gets downloaded via email? (Off list is fine.)
>
> On Fri, Jun 17, 2016 at 1:44 PM, Leonardo Collado Torres
> <lcollado at jhu.edu> wrote:
>
>> Hi,
>>
>> I'm trying to figure out what is going wrong with an error that pops
>> up on Windows only. It's currently the only error for a package that I
>> recently submitted to Bioc. The function is fairly simple: it
>> downloads an Rdata file from the web and loads it.
>>
>> If I try to download and load the file with R, the following error
>> occurs (only on Windows):
>>
>>
>> > library('downloader')
>> > download('https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true',
>> destfile = 'test.Rdata')
>> trying URL 'https://github.com/leekgroup/recount-website/blob/master/metadata/metadata_clean_sra.Rdata?raw=true'
>> Content type 'application/octet-stream' length 2531337 bytes (2.4 MB)
>> downloaded 2.4 MB
>>
>> > load('test.Rdata')
>> Error: ReadItem: unknown type 50, perhaps written by later version of R
>> > traceback()
>> 1: load("test.Rdata")
>> > options(width = 120)
>> > devtools::session_info()
>> Session info
>> -----------------------------------------------------------------------------------------------------------
>>  setting  value
>>  version  R version 3.3.0 (2016-05-03)
>>  system   x86_64, mingw32
>>  ui       Rgui
>>  language (EN)
>>  collate  English_United States.1252
>>  tz       America/New_York
>>  date     2016-06-17
>>
>> Packages
>> ---------------------------------------------------------------------------------------------------------------
>>  package    * version date       source
>>  devtools     1.11.1  2016-04-21 CRAN (R 3.3.0)
>>  digest       0.6.9   2016-01-08 CRAN (R 3.3.0)
>>  downloader * 0.4     2015-07-09 CRAN (R 3.3.0)
>>  memoise      1.0.0   2016-01-29 CRAN (R 3.3.0)
>>  withr        1.0.1   2016-02-04 CRAN (R 3.3.0)
>> >
>>
>>
>> If I open the same url on my browser and manually download the file,
>> then everything works as shown below:
>>
>> > load('metadata_clean_sra.Rdata')
>> > metadata_clean
>> Loading required package: S4Vectors
>> Loading required package: stats4
>> Loading required package: BiocGenerics
>> Loading required package: parallel
>> ## removed more output
>>
>> > options(width = 120)
>> > devtools::session_info()
>> Session info
>> -----------------------------------------------------------------------------------------------------------
>>  setting  value
>>  version  R version 3.3.0 (2016-05-03)
>>  system   x86_64, mingw32
>>  ui       Rgui
>>  language (EN)
>>  collate  English_United States.1252
>>  tz       America/New_York
>>  date     2016-06-17
>>
>> Packages
>> ---------------------------------------------------------------------------------------------------------------
>>  package      * version date       source
>>  BiocGenerics * 0.19.1  2016-06-17 Bioconductor
>>  devtools       1.11.1  2016-04-21 CRAN (R 3.3.0)
>>  digest         0.6.9   2016-01-08 CRAN (R 3.3.0)
>>  IRanges      * 2.7.2   2016-06-07 Bioconductor
>>  memoise        1.0.0   2016-01-29 CRAN (R 3.3.0)
>>  S4Vectors    * 0.11.3  2016-06-03 Bioconductor
>>  withr          1.0.1   2016-02-04 CRAN (R 3.3.0)
>> > print(object.size(metadata_clean), units = 'Mb')
>> 30.5 Mb
>>
>> The object itself is a DataFrame and was created using R 3.3.1 with
>> S4Vectors version 0.11.4. I get the same error if I re-save the data
>> on a Unix machine using R 3.3.0 (with S4Vectors from Bioc-release).
>>
>> Some Google leads point to a "corrupt file" or something about a hidden
>> session Rdata file. But from the manual test, everything looks fine,
>> unless downloader::download() (or alternatively utils::download.file())
>> is corrupting the file.
>>
>>
>> An option would be to include the data in the package, but I'd like to
>> avoid doing so to minimize the package size. It already has a big
>> data.frame that is necessary for the package to work. This short
>> function is there for convenience.
>>
>> Best,
>> Leo
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>
>
> --
> Gabriel Becker, Ph.D
> Associate Scientist
> Bioinformatics and Computational Biology
> Genentech Research
>
