[Bioc-devel] How to use RData files in Bioconductor data and software packages
web working
Thu Jan 9 22:00:52 CET 2020
Thank you for your detailed answer. I guess I expressed myself
unclearly. The BED files were just examples of data I store in the
inst/extdata folder. Based on the description for ExperimentHubData, I
have decided to create a software package and a data package (not an
ExperimentHub package). In my RData files I store software
package objects. These objects are bigger than 5 MB. Using a helper
function is not an option, because computing the objects takes too much
time. For this reason I want to load these objects in the examples of my
functions. My question is whether storing my RData files in the
inst/extdata directory is correct or not.
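For illustration, loading such an object in the software package could look roughly like the sketch below. It is only a sketch under assumptions: the helper name load_example_object, the file name example_object.RData and the cache environment are hypothetical, and a temporary file stands in for the file that would live in inst/extdata of dummyData.

```r
# Hypothetical cache: objects are read from disk only on the first request.
.cache <- new.env(parent = emptyenv())

# Hypothetical helper: return the object called `name` stored in the
# .RData file at `path`, without touching the global environment.
load_example_object <- function(path, name) {
    key <- paste(normalizePath(path), name, sep = "::")
    if (!exists(key, envir = .cache, inherits = FALSE)) {
        env <- new.env(parent = emptyenv())
        load(path, envir = env)
        if (!exists(name, envir = env, inherits = FALSE)) {
            stop("object '", name, "' not found in ", path)
        }
        assign(key, get(name, envir = env, inherits = FALSE), envir = .cache)
    }
    get(key, envir = .cache, inherits = FALSE)
}

# In the software package the path would be resolved along these lines
# (assumed file name, requires the data package to be installed):
# path <- system.file("extdata", "example_object.RData",
#                     package = "dummyData", mustWork = TRUE)

# Self-contained demonstration: a temporary file stands in for that path.
tmp <- tempfile(fileext = ".RData")
example_object <- list(id = "demo", values = 1:3)
save(example_object, file = tmp)

obj <- load_example_object(tmp, "example_object")   # read from disk
obj2 <- load_example_object(tmp, "example_object")  # served from the cache
```

Loading into a separate environment keeps the names stored in the RData file from overwriting anything in the user's workspace, and the cache means a large object is deserialized only once per session.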
On 09.01.20 at 17:59, Pages, Herve wrote:
> Hi Tobias,
> If the original data is in BED files, there should be no need to
> serialize the objects obtained by importing the files. It is **much**
> better to provide a small helper function that creates an object from a
> BED file and to use that function each time you need to load an object.
> This has at least 2 advantages:
> 1. It avoids redundant storage of the data.
> 2. By avoiding serialization of high-level S4 objects, it makes the
> package easier to maintain in the long run.
> Note that the helper function could also implement a cache mechanism
> (easy to do with an environment) so the BED file is only loaded and the
> object created the 1st time the function is called. On subsequent calls,
> the object is retrieved from the cache.
> However, if the BED files are really big (e.g. > 50 Mb), we require them
> to be stored on ExperimentHub instead of inside dummyData. Note that you
> still need to provide the dummyData package (which becomes what we call
> an ExperimentHub-based data package). See the "Creating An ExperimentHub
> Package" vignette in the ExperimentHubData package for more information
> about this.
> Hope this helps,
> On 1/9/20 04:45, web working wrote:
>> Dear all,
>> I am currently developing a software package (dummySoftware) and a data
>> package (dummyData), and I am a bit confused about where to store my RData
>> files in the data package. Here is my situation:
>> I want to store some software package objects (objects of new classes
>> from the software package) in the data package. These objects are example
>> objects and are too big for a software package. As I have read, all RData
>> objects should be stored in the data directory of a package.
>> BED files of the data package are stored in inst/extdata.
>> The data of the data package will be addressed in the software package
>> like this: system.file('extdata', 'subset.bed', package = 'dummyData').
>> And here the problem occurs. After building the data package
>> (devtools::build(args = c('--resave-data'))), all data in data/ are
>> stored in a datalist, Rdata.rdb, Rdata.rds and Rdata.rdx and cannot be
>> addressed with system.file. Addressing these data with the data()
>> function results in a warning during BiocCheck::BiocCheck().
>> My solution is to store the RData files in the inst/extdata directory
>> and address them with system.file. Something similar is mentioned here,
>> but in the context of a vignette
>> (r-pkgs.had.co.nz/data.html#other-data). Is this the right way to do it?