[Bioc-devel] License question for experimental data package

Tim Triche, Jr. tim.triche at gmail.com
Fri Mar 4 16:41:00 CET 2016


I was going to mention droit d'auteur under European civil law, but somehow that seemed more in Hervé's wheelhouse ;-).

--t

> On Mar 4, 2016, at 7:17 AM, Lyle Burgoon <burgoon.lyle at gmail.com> wrote:
> 
> Also keep in mind that US copyright rules for data differ from European ones. We ran into this recently when we wanted to publish European data from a web database.
> 
>> On Mar 4, 2016 10:05 AM, "Tim Triche, Jr." <tim.triche at gmail.com> wrote:
>> Data (facts) are not copyrightable, but databases (collections of facts) can be.  See Feist v. Rural for precedent; in short, there must be a non-obvious and creative aspect to the database for it to be elevated to copyrightable status.  I doubt that a collection of datasets would clear this bar, but it's still worth noting.
>> 
>> --t
>> 
>> > On Mar 4, 2016, at 6:22 AM, Robert M. Flight <rflight79 at gmail.com> wrote:
>> >
>> > I am pretty sure in general "data" is not copyrightable per se (
>> > http://www.lib.umich.edu/copyright/facts-and-data), so while I might
>> > contact the original authors as a courtesy, if the data has been released
>> > into any public database, then you should be free to do with it as you
>> > please. Providing the original accession numbers for the data and relevant
>> > citations (if they exist) so that it is easy for you and others to be given
>> > credit if the data is used would be a good thing to do.
>> >
>> > Also, I would personally go with CC0 (a waiver of copyright, see
>> > https://wiki.creativecommons.org/wiki/CC0) for a data package, as the data
>> > is already publicly available and you have just packaged it together into a
>> > useful set.
>> >
>> > My 2 cents.
>> >
>> > -Robert
>> >
>> > Robert M Flight, PhD
>> > Bioinformatics Research Associate
>> > Resource Center for Stable Isotope Resolved Metabolomics
>> > Manager, Systems Biology and Omics Integration Journal Club
>> > Markey Cancer Center
>> > CC434 Roach Building
>> > University of Kentucky
>> > Lexington, KY
>> >
>> > Twitter: @rmflight
>> > Web: rmflight.github.io
>> > ORCID: http://orcid.org/0000-0001-8141-7788
>> > EM rflight79 at gmail.com
>> > PH 502-509-1827
>> >
>> > To call in the statistician after the experiment is done may be no more
>> > than asking him to perform a post-mortem examination: he may be able to say
>> > what the experiment died of. - Ronald Fisher
>> >
>> >
>> >
>> > On Fri, Mar 4, 2016 at 8:52 AM Kasper Daniel Hansen <
>> > kasperdanielhansen at gmail.com> wrote:
>> >
>> >> For data packages, which do not contain any code, it seems weird to use a
>> >> software license such as GPL or GPL-2.  It seems better to use something
>> >> like Artistic-2.0 or one of the CC licenses.
>> >>
>> >> On Thu, Mar 3, 2016 at 5:15 PM, davide risso <risso.davide at gmail.com>
>> >> wrote:
>> >>
>> >>> Hi Hervé and Sean,
>> >>>
>> >>> thanks for your help. It will indeed be interesting to hear how other
>> >>> people chose the license, especially for those packages that redistribute
>> >>> a dataset not from their lab.
>> >>>
>> >>> I do have an experimental data package in Bioc, zebrafishRNASeq, but it's
>> >>> an experiment from a collaborator and at the time I didn't pay much
>> >>> attention to which license to use.
>> >>> In this case, I'd like to redistribute data from different labs. I guess I
>> >>> will contact the original authors at least as a courtesy.
>> >>> But I'm still keen to hear opinions on which license(s) are appropriate
>> >>> for experimental data sharing.
>> >>>
>> >>> Best,
>> >>> davide
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Thu, Mar 3, 2016 at 12:50 PM Hervé Pagès <hpages at fredhutch.org> wrote:
>> >>>
>> >>>> Hi Davide,
>> >>>>
>> >>>>> On 03/01/2016 02:25 PM, davide risso wrote:
>> >>>>> Dear Bioc developers,
>> >>>>>
>> >>>>> I recently downloaded three publicly available single-cell RNA-seq datasets
>> >>>>> from the NCBI GEO/SRA repository and created an R package with some
>> >>>>> gene-level summaries (read counts and FPKMs).
>> >>>>>
>> >>>>> I'm currently using the package locally for my own tests, but I think
>> >>>>> it may be a useful resource for the community, so I'm thinking of
>> >>>>> sharing it on GitHub and eventually submitting it to Bioconductor.
>> >>>>>
>> >>>>> I was not involved in any way with the original studies, and I'm wondering
>> >>>>> what the best practice is in terms of license / data sharing. Since there
>> >>>>> are many experimental data packages in Bioconductor, I'm guessing that I'm
>> >>>>> not the first person wondering about this.
>> >>>>>
>> >>>>> From the NCBI website, I read (quote from
>> >>>>> https://www.ncbi.nlm.nih.gov/home/about/policies.shtml):
>> >>>>> Databases of molecular data on the NCBI Web site include such examples as
>> >>>>> nucleotide sequences (GenBank), protein sequences, macromolecular
>> >>>>> structures, molecular variation, gene expression, and mapping data. They
>> >>>>> are designed to provide and encourage access within the scientific
>> >>>>> community to sources of current and comprehensive information. Therefore,
>> >>>>> NCBI itself places no restrictions on the use or distribution of the data
>> >>>>> contained therein. Nor do we accept data when the submitter has requested
>> >>>>> restrictions on reuse or redistribution. However, some submitters of the
>> >>>>> original data (or the country of origin of such data) may claim patent,
>> >>>>> copyright, or other intellectual property rights in all or a portion of the
>> >>>>> data (that has been submitted). NCBI is not in a position to assess the
>> >>>>> validity of such claims and since there is no transfer of rights from
>> >>>>> submitters to NCBI, NCBI has no rights to transfer to a third party.
>> >>>>> Therefore, NCBI cannot provide comment or unrestricted permission
>> >>>>> concerning the use, copying, or distribution of the information contained
>> >>>>> in the molecular databases.
>> >>>>>
>> >>>>> Should I contact the original authors for permission? Or is the fact that
>> >>>>> the data were publicly shared enough to grant me permission to redistribute?
>> >>>>> In that case, is there a standard license that I should use?
>> >>>>>
>> >>>>> Thanks for any feedback / thought!
>> >>>>
>> >>>> I don't have much to offer. AFAIK we don't really have guidelines or
>> >>>> recommendations for what license to use for experimental data packages,
>> >>>> except for the usual "make sure you use an appropriate license" advice.
>> >>>> So far it has really been up to each author/maintainer to make sure
>> >>>> they pick a license that is compatible with the
>> >>>> license/copyright/patent terms of the original data they are packaging
>> >>>> and with its redistribution through the Bioconductor channel.
>> >>>>
>> >>>> FWIW here is a summary of the licenses used by the 276 experimental
>> >>>> data packages currently in BioC devel:
>> >>>>
>> >>>>   License       Nb of packages
>> >>>>   ------------  --------------
>> >>>>   GPL                      135
>> >>>>   Artistic-2.0              96
>> >>>>   LGPL                      41
>> >>>>   other                      4
>> >>>>
>> >>>> It would be interesting to hear from other developers about this. For
>> >>>> example, how do people choose between GPL and Artistic-2.0? Is one
>> >>>> license typically more appropriate for packaging and redistributing
>> >>>> data that is already publicly available?
>> >>>>
>> >>>> H.
>> >>>>
>> >>>>>
>> >>>>> Best,
>> >>>>> davide
>> >>>>>
>> >>>>>      [[alternative HTML version deleted]]
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> Bioc-devel at r-project.org mailing list
>> >>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> >>>>>
>> >>>>
>> >>>> --
>> >>>> Hervé Pagès
>> >>>>
>> >>>> Program in Computational Biology
>> >>>> Division of Public Health Sciences
>> >>>> Fred Hutchinson Cancer Research Center
>> >>>> 1100 Fairview Ave. N, M1-B514
>> >>>> P.O. Box 19024
>> >>>> Seattle, WA 98109-1024
>> >>>>
>> >>>> E-mail: hpages at fredhutch.org
>> >>>> Phone:  (206) 667-5791
>> >>>> Fax:    (206) 667-1319
>> >>>>