[BioC] how to build a R package with the inclusion of inst/extdata
Yue Li
gorillayue at gmail.com
Fri Sep 7 03:25:23 CEST 2012
Sorry Steve, I'm actually stuck at building the package with inst/extdata. This is my first time trying to build an R package, so please bear with me. Let me walk you through my (incorrect) approach:
I have a set of R scripts and Rd files that need to be built into a package. I deliberately made all the examples in my Rd files trivial (such as simply running ls()) so that they pass the check. I can successfully build the package by running the following steps:
(1) construct package skeleton in R console:
scriptDir <- "~/Desktop/myRscripts/"
outDir <- "~/Desktop/"
sourceFiles <- list.files(path=scriptDir, pattern="[a-zA-Z]+\\.R$", full.names=TRUE, recursive=TRUE)
package.skeleton(name="mypackage", code_files=sourceFiles, path=outDir)
I now have a folder named "mypackage" sitting on my ~/Desktop. In a shell script, I do this:
(2) replace the skeleton Rd files in ~/Desktop/mypackage/man with my prepared Rd files by:
cp ~/Desktop/myRDfiles/*.Rd ~/Desktop/mypackage/man/
(3) R CMD build ~/Desktop/mypackage
(4) R CMD check ~/Desktop/mypackage_0.99.0.tar.gz
(5) R CMD INSTALL ~/Desktop/mypackage_0.99.0.tar.gz
All of the above steps work fine. But now I am at the stage of writing concrete examples for each function, and I use R CMD check in step (4) to make sure that the examples run successfully at check time. Some of the examples use BAM files, which I need to put into the package so that it ships with them as test data, exactly as the ShortRead package does.
I have learned that creating a subdirectory called "inst/extdata" inside the package folder (as in ShortRead) is the conventional way to include test data. So after step (2), I do this:
cp -r inst ~/Desktop/mypackage/
But then I cannot successfully perform (3) because it returns an error:
$ R CMD build mypackage/
* checking for file ‘mypackage/DESCRIPTION’ ... OK
* preparing ‘mypackage’:
* checking DESCRIPTION meta-information ... OK
* excluding invalid files
Subdirectory 'man' contains invalid file names:
‘.Rhistory’
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building ‘mypackage_0.99.0.tar.gz’
/usr/bin/gnutar: mypackage/inst/extdata/expt1/accepted_hits_noDup.bam: file changed as we read it
/usr/bin/gnutar: mypackage/inst/extdata/expt2/accepted_hits_noDup.bam: file changed as we read it
/usr/bin/gnutar: mypackage/inst/extdata/expt3/accepted_hits_noDup.bam: file changed as we read it
ERROR
packaging into .tar.gz failed
I'm just wondering at which step between (1) and (5) I could incorporate inst/extdata into the package so that the tarball contains it.
Thanks very much for your patient help!
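One possible place for the copy is between steps (2) and (3), once the skeleton exists and before R CMD build creates the tarball. The following is a minimal sketch of that extra step only, using a temporary directory and placeholder files instead of real BAM data. Every path and file name in it is a hypothetical stand-in, and it does not address the gnutar "file changed as we read it" message shown above.

```shell
#!/bin/sh
# Sketch only: every path and file name below is a hypothetical stand-in.
set -eu

work=$(mktemp -d)          # scratch area used here instead of ~/Desktop
pkg="$work/mypackage"      # stands in for the package.skeleton() output
src="$work/testdata"       # stands in for wherever the BAM files live

# Fake the state after steps (1) and (2): a package tree plus some test data.
mkdir -p "$pkg/R" "$pkg/man"
for e in expt1 expt2 expt3; do
    mkdir -p "$src/extdata/$e"
    printf 'placeholder\n' > "$src/extdata/$e/accepted_hits_noDup.bam"
done

# The extra step, between (2) and (3): create inst/ and copy extdata into it.
# cp needs -R to copy a directory together with everything below it.
mkdir -p "$pkg/inst"
cp -R "$src/extdata" "$pkg/inst/"

# List the data files that now sit inside the package source tree.
(cd "$work" && find mypackage/inst -type f | sort)

# Steps (3) to (5) would then follow unchanged, starting with
# R CMD build on the package directory.
```

The point of the layout is that the data ends up under mypackage/inst/extdata in the source tree, which is the same arrangement the ShortRead source package uses.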
Yue
On 2012-09-06, at 7:50 PM, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
> Hi,
>
> On Thu, Sep 6, 2012 at 7:21 PM, Yue Li <gorillayue at gmail.com> wrote:
>> Hi Steven,
>>
>> Thanks for the quick response. I think I probably didn't articulate my intent clearly.
>
> I actually understood your intent -- I thought you were confused on
> why you were getting some error when you ran the `R CMD build ...`
> command you posted previously.
>
> The problem was that you were trying to build something that wasn't
> really a package -- it seemed as if you were trying to build the
> *parent* directory your package directory was living in.
>
>> Basically, I'm trying to develop an R package rather than use someone else's package. In order to run some examples for the functions I wrote, I need to have BAM data saved in "inst/extdata" (or anywhere, for that matter). So when I call:
>>
>> R CMD check mypackage
>>
>> The example that says something like
>>
>> testfiles <- system.file("inst/extdata/*bam$", package = "mypackage", )
>>
>> can give me the BAM files saved in that inst/extdata/ that come with the tar ball package. But I'm too ignorant to figure out how to do that.
>
> If you want to do this pattern matching on *.bam, I'm pretty sure you
> can't do it in a call to system.file, so you'd first get a handle on
> your `extdata` directory, then call `dir` on it. For example (and to
> be extra explicit), assuming you install your package successfully, you
> would then do in R:
>
> R> extdata.dir <- system.file("extdata", package="myPackage")
> R> bamfiles <- dir(extdata.dir, pattern="\\.bam$", full.names=TRUE)
>
> The directory structure of your package would look something like this:
>
> myPackage
>   `- inst
>       `- extdata
>           `- data1.bam
>           `- data2.bam
>   `- R
>       `- ...
>   `- NAMESPACE
>   `- DESCRIPTION
>
> And note that when you actually install the package, the contents
> inside the `inst` directory get "hoisted" out of it and dropped into
> the directory of your package, e.g. after installation, on your
> filesystem the `extdata` directory would be something like:
>
> /path/to/your/R/library/myPackage/extdata/
>
> Download the source code of, say, the ShortRead package to see the
> structure you want to follow:
>
> http://www.bioconductor.org/packages/2.10/bioc/src/contrib/ShortRead_1.14.4.tar.gz
>
> HTH,
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact