[Bioc-devel] What is Bioconductor's position on allowing users to create files in the working directory without an explicit path definition in the filename

Shepherd, Lori Lor|@Shepherd @end|ng |rom Ro@we||P@rk@org
Fri Mar 22 17:02:57 CET 2019

To chime in and expand a bit on Vince's comments:

I feel Bioconductor's position when accepting packages is, with few exceptions, that nothing should be written or saved to a user's directory without the express permission of the user, for fear of overwriting the user's own directories or files that predate the package's intended use, as Vince explained.

For this reason we recommend that the defaults of all functions, and the usage in man pages, vignettes, and tests, write to tempdir()/tempfile() locations.  If the package documentation is clear, then it should be understood that in practical use the user should specify a more permanent location for file creation rather than a temporary one.
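
A minimal sketch of that recommendation follows. The function name write_matrix and its arguments are hypothetical and are not taken from any package discussed in this thread; the point is only that the output path defaults to tempfile() and a permanent location is used only when the user supplies one.

```r
# Hypothetical helper: writes a matrix to disk and returns the path used.
# The default is a session temporary file, so nothing lands in the user's
# working directory unless they explicitly ask for it.
write_matrix <- function(x, file = tempfile(fileext = ".csv")) {
    write.csv(x, file = file, row.names = FALSE)
    invisible(file)
}

m <- matrix(1:6, nrow = 2)
tmp_path <- write_matrix(m)   # default: a file under tempdir()
```

In practical use the user would instead call something like write_matrix(m, file = "/path/they/chose/results.csv"), and examples, vignettes, and tests would rely on the default.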

If a file is supposed to persist, BiocFileCache is an option for monitoring and storing files, and it is becoming a more standard way of organizing them.  The idea is to save objects to the cache under a given "rname" that serves as a unique identifier.  Using that identifier, your package or its users should be able to use bfcquery to query the cache and retrieve the file path.  As Vince said, this should then be documented in your package.  Without thoroughly understanding the implementation of your package, I think this might be of use to you.
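
As a rough sketch of that workflow (the cache location, the rname "my_experiment_data", and the saved object are illustrative assumptions, and a real package would normally use a persistent cache directory rather than one under tempdir()):

```r
library(BiocFileCache)

# Throwaway cache under tempdir() so this sketch leaves nothing behind.
bfc <- BiocFileCache(file.path(tempdir(), "demo_cache"), ask = FALSE)

# Register a new resource under a chosen rname; bfcnew() returns the path
# inside the cache where the file should be written.
path <- bfcnew(bfc, rname = "my_experiment_data", ext = ".rds")
saveRDS(mtcars, unname(path))

# Later, look the resource up by its rname and recover the file path.
hit <- bfcquery(bfc, "my_experiment_data", field = "rname")
found <- bfcrpath(bfc, rids = hit$rid)
```

The user never needs to know where the cache keeps the file; the rname is the handle, which is why it should be documented.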

Less likely to apply, depending on the implementation in your package: the bfcadd function has an argument action = c("copy", "move", "asis") that controls whether the file is copied into the BiocFileCache directory, moved there, or left in its original location.
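
A hedged sketch of the "asis" case, assuming a file the user has already written to a location of their own choosing (the file name, rname, and cache location here are made up for illustration):

```r
library(BiocFileCache)

bfc <- BiocFileCache(file.path(tempdir(), "demo_cache_actions"), ask = FALSE)

# A file the user created at a path they picked themselves.
user_file <- file.path(tempdir(), "user_output.txt")
writeLines("some results", user_file)

# action = "asis" records the file in the cache's database but leaves it
# where it is; "copy" or "move" would place it inside the cache directory.
rpath <- bfcadd(bfc, rname = "user_output", fpath = user_file,
                rtype = "local", action = "asis")
tracked <- bfcquery(bfc, "user_output", field = "rname")
```

With this arrangement the file stays where the user expects it, while the cache still provides a queryable record of it.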


Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263

From: Bioc-devel <bioc-devel-bounces using r-project.org> on behalf of Vincent Carey <stvjc using channing.harvard.edu>
Sent: Friday, March 22, 2019 11:55:05 AM
To: Koustav Pal
Cc: bioc-devel; Ferrari Francesco
Subject: Re: [Bioc-devel] What is Bioconductor's position on allowing users to create files in the working directory without an explicit path definition in the filename

Guidelines on this topic do not seem to be present in our web
site; there is a link to Wickham's guide but I don't see that it
confronts the topic.  I will make some unofficial and possibly
wrong remarks.

Suppose my function has to create a file "foo.txt".  If I do it
in the working folder, I might destroy a user's cherished file.
So I should check to see if the filename I need is already in use.
If it is, I need to do something graceful.
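
The check described above might look roughly like this; safe_write_lines and its overwrite argument are hypothetical names used only for illustration, and stopping with an error is just one possible "graceful" behavior.

```r
# Hypothetical writer that refuses to clobber an existing file unless the
# caller explicitly opts in with overwrite = TRUE.
safe_write_lines <- function(text, file, overwrite = FALSE) {
    if (file.exists(file) && !overwrite) {
        stop("'", file, "' already exists; pass overwrite = TRUE to replace it",
             call. = FALSE)
    }
    writeLines(text, file)
    invisible(file)
}

# Demonstrated under tempdir() so the sketch itself touches no user files.
target <- file.path(tempdir(), "foo.txt")
unlink(target)
safe_write_lines("first version", target)

# A second write to the same path is rejected instead of overwriting.
second_try <- tryCatch(
    safe_write_lines("second version", target),
    error = function(e) conditionMessage(e)
)
```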

That's a lot of complexity that may never actually be used.  Can
we avoid it completely?  Here are a few ways to avoid it:

1) Don't create files, just create objects and leave the serialization
task to the user.  You can provide helper functions and documentation but
the details of target location of the serialization are left to the user.

2) If you create a file, use R's tempfile/tempdir discipline to avoid
the need for checking for clobber.  If the content needs to persist the
user should direct this, again with helpers as needed.

3) If you create a file that should persist, use BiocFileCache as that
addresses the location problem and has an added benefit of obligatory
metadata binding.  This is an underused strategy and more pedagogy
is surely in order.  If the user "cannot find" what has been made, there
is a systematic approach available that involves querying the cache.  Your
documentation will supply all relevant details.

On Fri, Mar 22, 2019 at 11:38 AM Koustav Pal <koustav.pal using ifom.eu> wrote:

> Hello,
> My package HiCBricks was submitted and accepted under the previous 3.8
> release of Bioconductor.
> At the time, during package review, my reviewer had expressed reservations
> towards my package creating
> files in the current working directory.
> [REQUIRED] CreateLego() creates HDF5 files in the current directory if no
> path is given in the Output.Filename argument. This may clutter the working
> directory and it would be better to have the files saved to a temporary
> file
> (or directory) using tempfile() (or tempdir()).
> This was with regards to the main output files that were being created by
> my package.
> I clarified the specific point in question with my reviewer.
> The idea behind this package is to create a HDF file for storing
> high-resolution Hi-C (can be as large as a user wants) data and keep it as
> a persistent copy which the user can access later without having to reload
> the file. Therefore, I am a bit averse towards creating a tempfile or
> tempdir. Using a temporary file would go against this idea and would
> probably result in the user not having access to the file later. I have
> incorporated a control statement which will issue a warning regarding file
> creation inside the current working directory. Is that ok?
> Finally, my reviewer suggested that I make use of the BiocFileCache
> package to create files.
> The changes so far look good. I understand that tempfile() isn't a great
> solution for your package, so may I recommend that you store your data
> using the BiocFileCache package
> (https://bioconductor.org/packages/release/bioc/html/BiocFileCache.html)
> as opposed to automatically saving the file in a local directory. Once this
> change is made, I should be able to accept the package.
> I interpreted this as the reviewer expressing reservation towards files
> being created in the
> current working directory without the user's explicit requirement.
> Therefore, I made a working
> implementation of BiocFileCache within my package, which works perfectly
> fine.
> Yet, users are now facing troubles when having to locate files that they
> may have created in the current
> working directory using the traditional method of var = "something.txt",
> because these files were created in
> the BiocFileCache cache during file creation. All the confusion and issue
> stems from this being a non-traditional
> method of keeping track of files and folders.
> What is Bioconductor's position regarding this issue?
> Can users create files using Bioconductor packages in the current working
> directory without an explicit path definition in the filename?
> Or did I misinterpret the reviewer's position and this is only a
> requirement when the package is being built by the builder?
> Koustav Pal,
> Post-Doctoral Fellow in Genome Architecture,
> Computational Genomics Group,
> IFOM - The FIRC Institute of Molecular Oncology,
> Via Adamello 16,
> 20139 Milano, Italy.
> Phone: +393441130157
> E-mail: koustav.pal using ifom.eu