[Bioc-devel] include data from non self-sufficient .R files

Vincent Carey stvjc at channing.harvard.edu
Tue Oct 12 19:13:14 CEST 2010


If I understand correctly, Laurent wants to avoid serializing S4
objects (in the data folder), probably because he would then have to
remember to update the serializations whenever the class definitions
change, even if the serialized data are not affected by such changes.
The SNPlocs.* packages address this concern -- while there could be
nice advantages to having SNP metadata serialized as GRanges
instances, the actual data is well-represented in a SQLite table and
one uses a function to get a GRanges representation when desired.
This avoids the problem of requiring reserialization of a potentially
large object whenever the GRanges class definition changes.
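
A minimal sketch of that pattern, assuming a plain data.frame in place
of the SQLite table; the class "aClass", the table rawTable and the
accessor getMyData1() are hypothetical names for illustration and are
not taken from SNPlocs.*:

```r
library(methods)

# Hypothetical S4 class standing in for a class defined in the package.
setClass("aClass", representation(id = "character", value = "numeric"))

# The data are kept in a plain, class-independent form. Only this table
# would be stored on disk, so a change to "aClass" does not invalidate it.
rawTable <- data.frame(id = c("rs1", "rs2"),
                       value = c(1.5, 2.5),
                       stringsAsFactors = FALSE)

# The S4 instance is built on demand from the plain representation.
getMyData1 <- function(tab = rawTable) {
  new("aClass", id = tab$id, value = tab$value)
}

obj <- getMyData1()
```

The instance is always created against the current class definition,
so nothing has to be reserialized when the definition changes.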

I doubt there is a completely satisfactory solution to this problem.
Especially when one S4 class extends another that has unstable
definition, serialized S4 instances can become a bit problematic to
maintain.  A number of bioc core packages illustrate how updateObject
methods can be written to simplify aspects of maintenance.  The class
versioning discipline defined in Biobase is also useful for
maintenance, but I do not know how broadly it is used in contributed
packages.
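
To illustrate the idea only, here is a rough sketch in the spirit of
an updateObject method. It does not use the Biobase generic; the
function refreshInstance() and the class "aClass" are hypothetical,
and the sketch assumes that slots are only added between versions:

```r
library(methods)

# Original definition and an instance created (or serialized) under it.
setClass("aClass", representation(id = "character"))
old <- new("aClass", id = "x")

# A later revision of the class adds a slot; 'old' now lacks it.
setClass("aClass", representation(id = "character", note = "character"))

# Build a fresh instance from the current definition and copy over
# every slot that the outdated instance actually has.
refreshInstance <- function(object) {
  cls <- class(object)
  target <- new(cls)
  for (s in slotNames(cls)) {
    if (.hasSlot(object, s)) slot(target, s) <- slot(object, s)
  }
  target
}

updated <- refreshInstance(old)
```

Slots missing from the old instance keep the prototype value of the
current definition; renamed or retyped slots would need explicit
handling.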

On Tue, Oct 12, 2010 at 12:12 PM, Steve Lianoglou
<mailinglist.honeypot at gmail.com> wrote:
> Hi,
>
> On Tue, Oct 12, 2010 at 3:39 AM, Laurent Gatto <laurent.gatto at gmail.com> wrote:
>> Hi Sean,
>>
>> On 12 October 2010 00:15, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>>>
>>>
>>> On Mon, Oct 11, 2010 at 6:58 PM, Laurent Gatto <laurent.gatto at gmail.com>
>>> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I have some data to be included in a package. The data needs to be
>>>> easily loadable by the users, optimally with data(), and consists of a
>>>> set of class instances, classes that are defined in the package
>>>> itself.
>>>> Something like:
>>>> myData1 <- new("aClass",...) ## in myData1.R
>>>> myData2 <- new("aClass",...) ## in myData2.R
>>>>
>>>> that I would like to be loadable with
>>>>
>>>> data(myData1)
>>>> data(myData2)
>>>>
>>>> Although putting the .R files that generate the data objects directly
>>>> in the data directory would be the simplest solution, I cannot do this
>>>> because the code is not self-sufficient. I can't figure out how to
>>>> easily and automatically include these.
>>>> What is the suggested way to include this kind of data in a package?
>>>>
>>>> Thank you very much in advance.
>>>>
>>>
>>> Hi, Laurent.
>>> You can save() your data objects and put them in your data directory.  That
>>> should do it.  Of course, you will want to document them, also.  You can
>>> look at the "Writing R Extensions" manual for more details.
>>> Sean
>>>
>>
>> Thank you for the advice. I was rather looking for an automatic way of
>> including the objects at installation time, as the code is readily
>> available in the package. But as the class and these objects are not
>> likely to change too much in the future, adding them once manually is
>> fine, of course.
>> By the way, I tried to have an R source file (in inst/scripts/ or R/)
>> create the instances and save() them in data/, but without success.
>
> I'm not sure that I follow (or that you follow :-), but what Sean is
> suggesting is that you create the data yourself, manually, rather
> than with some automated R script, and then save() the data into an
> RData file that you put into your package's data/ directory.
>
> Once the user installs the package, the data will come with the
> package ("automatically"). The user can then load that data by using
> the data() function calls, as you mentioned.
>
> A call like:
>
> R> data(myData1)
>
> would then work if you have an *.RData file called "myData1.RData" in
> your package's data/ folder.
>
> See the help in ?data to get a more detailed overview of how data is
> searched for and loaded.
>
> Does that help?
>
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
> _______________________________________________
> Bioc-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>


