[Bioc-devel] What is Bioconductor's position on allowing users to create files in the working directory without an explicit path definition in the filename

Vincent Carey @tvjc @end|ng |rom ch@nn|ng@h@rv@rd@edu
Fri Mar 22 16:55:05 CET 2019


Guidelines on this topic do not seem to be present on our web
site; there is a link to Wickham's guide, but I don't see that it
addresses the topic.  I will make some unofficial and possibly
wrong remarks.

Suppose my function has to create a file "foo.txt".  If I do it
in the working folder, I might destroy a user's cherished file.
So I should check to see if the filename I need is already in use.
If it is, I need to do something graceful.
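
A minimal sketch of such a check, in base R only.  The helper name
write_lines_safely() and its overwrite argument are made up for
illustration and are not part of any package:

```r
# Hypothetical helper: write lines to `path`, refusing to overwrite
# an existing file unless the caller explicitly asks for it.
write_lines_safely <- function(lines, path, overwrite = FALSE) {
    if (file.exists(path) && !overwrite) {
        stop("'", path, "' already exists; choose another name or set ",
             "overwrite = TRUE", call. = FALSE)
    }
    writeLines(lines, path)
    invisible(path)
}

# Demonstration in a scratch directory, so that nothing in the
# working directory is touched.
target <- file.path(tempdir(), "foo.txt")
unlink(target)                                  # start from a clean state
write_lines_safely("first version", target)     # succeeds
result <- tryCatch(
    write_lines_safely("second version", target),
    error = function(e) "refused"               # existing file is kept
)
```

Stopping with an error is only one possible "graceful" response; asking
the user or generating a fresh name would be others.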

That's a lot of complexity that may never actually be used.  Can
we avoid it completely?  Here are a few ways to avoid it:

1) Don't create files, just create objects and leave the serialization
task to the user.  You can provide helper functions and documentation but
the details of target location of the serialization are left to the user.

2) If you create a file, use R's tempfile/tempdir discipline to avoid
the need to check for clobbering.  If the content needs to persist, the
user should direct this, again with helpers as needed.

3) If you create a file that should persist, use BiocFileCache as that
addresses the location problem and has an added benefit of obligatory
metadata binding.  This is an underused strategy and more pedagogy
is surely in order.  If the user "cannot find" what has been made, there
is a systematic approach available that involves querying the cache.  Your
documentation will supply all relevant details.
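
Options 2 and 3 might be sketched roughly as follows.  The helper
keep_copy(), the resource name "foo" and the demo cache location are
assumptions made for illustration; the BiocFileCache part only runs if
that package is installed, and a real package would normally use the
default cache location rather than tempdir():

```r
# Option 2: write to a session-scoped temporary file.  tempfile()
# returns a name not currently in use, so no clobber check is needed.
tmp <- tempfile(pattern = "foo_", fileext = ".txt")
writeLines("intermediate result", tmp)

# Hypothetical helper: the user, not the package, decides where a
# persistent copy should live.
keep_copy <- function(from, to, overwrite = FALSE) {
    ok <- file.copy(from, to, overwrite = overwrite)
    if (!ok) stop("could not copy to '", to, "'", call. = FALSE)
    invisible(to)
}

# Option 3: let BiocFileCache choose the location and record metadata.
if (requireNamespace("BiocFileCache", quietly = TRUE)) {
    cache_dir <- file.path(tempdir(), "demo_cache")   # demo location only
    bfc <- BiocFileCache::BiocFileCache(cache_dir, ask = FALSE)

    # Reserve a path inside the cache under the name "foo", then write to it.
    cached_path <- BiocFileCache::bfcnew(bfc, rname = "foo", ext = ".txt")
    writeLines("persistent result", cached_path)

    # Later, a user who "cannot find" the file queries the cache by name.
    hits  <- BiocFileCache::bfcquery(bfc, "foo")
    found <- BiocFileCache::bfcrpath(bfc, rids = hits$rid)
}
```

With keep_copy(tmp, "some/user/chosen/path.txt") the decision about a
permanent location stays with the user, which is the point of option 2.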

On Fri, Mar 22, 2019 at 11:38 AM Koustav Pal <koustav.pal using ifom.eu> wrote:

> Hello,
>
> My package HiCBricks was submitted and accepted under the previous 3.8
> release of Bioconductor.
>
> At the time, during package review, my reviewer had expressed reservations
> about my package creating files in the current working directory.
>
>
> [REQUIRED] CreateLego() creates HDF5 files in the current directory if no
> path is given in the Output.Filename argument. This may clutter the working
> directory and it would be better to have the files saved to a temporary
> file (or directory) using tempfile() (or tempdir()).
>
>
> This was with regard to the main output files that were being created by
> my package.
> I clarified the specific point in question with my reviewer.
>
>
> The idea behind this package is to create an HDF5 file for storing
> high-resolution Hi-C data (which can be as large as a user wants) and keep
> it as a persistent copy which the user can access later without having to reload
> the file. Therefore, I am a bit averse towards creating a tempfile or
> tempdir. Using a temporary file would go against this idea and would
> probably result in the user not having access to the file later. I have
> incorporated a control statement which will issue a warning regarding file
> creation inside the current working directory. Is that ok?
>
>
> Finally, my reviewer suggested that I make use of the BiocFileCache
> package to create files.
>
>
> The changes so far look good. I understand that tempfile() isn't a great
> solution for your package, so may I recommend that you store your data
> using the BiocFileCache package
> <https://bioconductor.org/packages/release/bioc/html/BiocFileCache.html>
> as opposed to automatically saving the file in a local directory. Once this
> change is made, I should be able to accept the package.
>
> I interpreted this as the reviewer expressing reservations about files
> being created in the current working directory without the user's explicit
> request. Therefore, I made a working implementation of BiocFileCache within
> my package, which works perfectly fine.
>
> Yet users now have trouble locating files that they expected to create in
> the current working directory using the traditional method of
> var = “something.txt”, because these files were instead created in the
> BiocFileCache cache. The confusion stems from this being a non-traditional
> method of keeping track of files and folders.
>
> What is Bioconductor’s position regarding this issue?
>
> Can users create files using Bioconductor packages in the current working
> directory without an explicit path definition in the filename?
>
> Or did I misinterpret the reviewer’s position and this is only a
> requirement when the package is being built by the builder?
>
>
> Koustav Pal,
> Post-Doctoral Fellow in Genome Architecture,
> Computational Genomics Group,
> IFOM - The FIRC Institute of Molecular Oncology,
> Via Adamello 16,
> 20139 Milano, Italy.
> Phone: +393441130157
> E-mail: koustav.pal using ifom.eu
