[Bioc-devel] Zipped Rdata files in windows binaries

Tarca, Adi atarca at med.wayne.edu
Fri Feb 6 22:36:35 CET 2009


Dear Robert,

Thank you for your advice. I will include all the data at build time as .RData files. My only concern was whether there is a limit on the disk space these files may occupy. They currently total around 10 MB, but that may double in future releases.
I was not too worried about people needing an internet connection when using my package with a new organism for the first time, since it is the same thing as trying to use some affy functions on a chip for which you do not have the cdf (except that there you download an additional package rather than a file).

Regards,
Adi  

  


Adi Laurentiu Tarca, PhD
Assistant Professor (Research), 
Bioinformatics and Computational Biology Unit of the NIH Perinatology Research Branch,
Department of Computer Science & Center for Molecular Medicine and Genetics,
Wayne State University, 
3990 John R., Office 4809,
Detroit, Michigan 48201
Tel: 1-313-5775305 
Cell: 1-313-4043116 
http://bioinformaticsprb.med.wayne.edu/tarca/

-----Original Message-----
From: rgentlem at gmail.com [mailto:rgentlem at gmail.com] On Behalf Of Robert Gentleman
Sent: Friday, February 06, 2009 3:56 PM
To: Tarca, Adi
Cc: bioc-devel at stat.math.ethz.ch
Subject: Re: [Bioc-devel] Zipped Rdata files in windows binaries

Hi,

On Fri, Feb 6, 2009 at 11:05 AM, Tarca, Adi <atarca at med.wayne.edu> wrote:
>
>
> Hi all,
> I am writing an R package, and at a given point I need to load an Rdata file from the "data" folder of the installed package; if the file is not there, I try to download it from somewhere.
>
  That does not sound like a good thing to do.  The data folder is exclusively for data that is stored essentially at package build time and is not a place to put other files, or to use during a session to store objects.  Objects there are platform independent and are accessed using the data command in R.  Please don't try to modify this behavior.

   If you want/need to have your own data storage type and want to control it, you should use a different folder.  A common choice is inst/extdata.  And then you are in control of everything.
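
   As a minimal sketch of the inst/extdata approach described above: files placed under inst/extdata in the source package are installed into the package's extdata folder, and system.file() returns "" when a file (or the package) is not found. The helper name find_extdata and the file name "hsa.RData" are placeholders made up for illustration, not part of any package.

```r
# Hypothetical helper: locate a file shipped in a package's extdata folder.
# Returns the full path, or NULL if the file (or the package) is not installed.
find_extdata <- function(fname, pkg) {
  path <- system.file("extdata", fname, package = pkg)
  if (nzchar(path)) path else NULL
}

# Usage sketch; "hsa.RData" is a placeholder file name.
p <- find_extdata("hsa.RData", "SPIA")
if (is.null(p)) {
  message("File not installed; fall back to another source here.")
} else {
  env <- new.env()       # private environment, keeps the workspace clean
  load(p, envir = env)   # load() reads .RData files from an explicit path
}
```

   Because the check goes through a plain file path, it does not depend on how the "data" folder is packed in a binary build.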

   Since lots of people use R in situations where they do not have access to the internet, the idea that they should download something for your package to work seems problematic.  Why not just use one of the many platform-independent formats and distribute the data on all platforms in the same way?

   There are a number of examples in Bioconductor packages (e.g. simpleaffy or flowCore).

  Best wishes
    Robert


> I used to do the following test to see if a file called "datload" is NOT there, in which case I need to download it:
>
>  if(! paste(datload,".RData",sep="") %in%
>       dir(system.file("data",package="SPIA"))) {
>    ...download the file from somewhere else }
>
> It works fine, except that the Windows binary package created by the Bioconductor scripts from my source puts all RData files in an Rdata.zip file. Is there a way to list the files in Rdata.zip to see if my file is in there?
>
> Alternatively, I tried to use the data() function to load it (in a private environment) and, if it is not loaded, to download it. However, the data() function does not raise an error, only a warning.
> I tried to use:
>
> ow <- options("warn")
> options(warn=2) # to make warnings into errors
> errs<-try(data(list=datload, envir=.myDataEnv),silent=TRUE)
> options(ow) # restore the previous warning level
>
>  if(inherits(errs, "try-error")){ # loading failed
>  ...download the file from somewhere else  }
>
> This works fine, except that a warning is still printed when the function returns.
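
One possible alternative to the try() approach quoted above, offered as a sketch rather than a definitive fix: tryCatch() with a warning handler intercepts the warning that data() signals for a missing data set, so nothing is printed and no global option has to be changed. The helper name try_data is hypothetical, and the "datasets" package is used only so that the example runs anywhere.

```r
# Hypothetical helper: try to load a data set into `envir`.
# Returns TRUE on success, FALSE if data() signals a warning or an error.
try_data <- function(name, pkg, envir = new.env()) {
  tryCatch({
    data(list = name, package = pkg, envir = envir)
    TRUE
  },
  warning = function(w) FALSE,  # e.g. data set not found
  error   = function(e) FALSE)  # e.g. package not installed
}

# Usage sketch with a data set that ships with base R.
myEnv <- new.env()
if (!try_data("iris", "datasets", myEnv)) {
  message("Data set not available; obtain it some other way here.")
}
```

The handlers apply only to the wrapped call, so warnings raised elsewhere in the session are unaffected.
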
>
> Any ideas would be appreciated.
>
>
> Thanks,
> Adi Laurentiu Tarca
>
> _______________________________________________
> Bioc-devel at stat.math.ethz.ch mailing list 
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>



--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


