[Bioc-devel] License question for experimental data package

Robert M. Flight rflight79 at gmail.com
Fri Mar 4 15:22:27 CET 2016


I am pretty sure that, in general, "data" is not copyrightable per se (
http://www.lib.umich.edu/copyright/facts-and-data). So while I might
contact the original authors as a courtesy, if the data has been released
into a public database, you should be free to do with it as you please.
It would also be good to provide the original accession numbers for the
data and the relevant citations (if they exist), so that credit is easy to
give when the data is used.

Also, I would personally go with CC0 (waiver of copyright, see
https://wiki.creativecommons.org/wiki/CC0) for a data package, since the
data is already publicly available and you have just packaged it together
into a useful set.
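
For illustration, in an R data package this choice comes down to the License
field of the DESCRIPTION file. Below is a minimal sketch, assuming a
hypothetical package name, title, and description (none of them come from
this thread). R's license database recognises the abbreviation CC0, but
whether a given repository accepts it is something to check separately.

```
Package: myHypotheticalData
Type: Package
Title: Gene-Level Summaries of Public Single-Cell RNA-seq Datasets
Version: 0.99.0
Description: Read counts and FPKMs for publicly available datasets,
    with the original accession numbers and citations listed in the
    package documentation.
License: CC0
biocViews: ExperimentData
```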

My 2 cents.

-Robert

Robert M Flight, PhD
Bioinformatics Research Associate
Resource Center for Stable Isotope Resolved Metabolomics
Manager, Systems Biology and Omics Integration Journal Club
Markey Cancer Center
CC434 Roach Building
University of Kentucky
Lexington, KY

Twitter: @rmflight
Web: rmflight.github.io
ORCID: http://orcid.org/0000-0001-8141-7788
EM rflight79 at gmail.com
PH 502-509-1827

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. - Ronald Fisher



On Fri, Mar 4, 2016 at 8:52 AM Kasper Daniel Hansen <
kasperdanielhansen at gmail.com> wrote:

> For data packages, which do not contain any code, it seems weird to use a
> software license such as GPL or GPL-2.  It seems better to use something
> like Artistic-2.0 or one of the CC licenses.
>
> On Thu, Mar 3, 2016 at 5:15 PM, davide risso <risso.davide at gmail.com>
> wrote:
>
> > Hi Hervé and Sean,
> >
> > thanks for your help. It will indeed be interesting to hear how other
> > people chose the license, especially for those packages that redistribute
> > a dataset not from their lab.
> >
> > I do have an experimental data package in Bioc, zebrafishRNASeq, but it's
> > an experiment from a collaborator and at the time I didn't pay much
> > attention to which license to use.
> > In this case, I'd like to redistribute data from different labs. I guess
> > I will contact the original authors at least as a courtesy.
> > But I'm still keen to hear opinions on which license(s) is appropriate
> > for experimental data sharing.
> >
> > Best,
> > davide
> >
> >
> >
> >
> > On Thu, Mar 3, 2016 at 12:50 PM Hervé Pagès <hpages at fredhutch.org> wrote:
> >
> > > Hi Davide,
> > >
> > > On 03/01/2016 02:25 PM, davide risso wrote:
> > > > Dear Bioc developers,
> > > >
> > > > I recently downloaded three publicly available single-cell RNA-seq
> > > > datasets from the NCBI GEO/SRA repository and created an R package with
> > > > some gene-level summaries (read counts and FPKMs).
> > > >
> > > > I'm currently using the package locally for my own tests, but I'm
> > > > thinking that this may be a useful resource for the community and
> > > > thinking of sharing it on GitHub and eventually submitting it to
> > > > Bioconductor.
> > > >
> > > > I was not involved in any way with the original studies, and I'm
> > > > wondering what is the best practice in terms of license / data sharing.
> > > > Since there are many experimental data packages in Bioconductor, I'm
> > > > guessing that I'm not the first person wondering about this.
> > > >
> > > > From the NCBI website, I read (quote from
> > > > https://www.ncbi.nlm.nih.gov/home/about/policies.shtml):
> > > > Databases of molecular data on the NCBI Web site include such examples
> > > > as nucleotide sequences (GenBank), protein sequences, macromolecular
> > > > structures, molecular variation, gene expression, and mapping data.
> > > > They are designed to provide and encourage access within the scientific
> > > > community to sources of current and comprehensive information.
> > > > Therefore, NCBI itself places no restrictions on the use or distribution
> > > > of the data contained therein. Nor do we accept data when the submitter
> > > > has requested restrictions on reuse or redistribution. However, some
> > > > submitters of the original data (or the country of origin of such data)
> > > > may claim patent, copyright, or other intellectual property rights in
> > > > all or a portion of the data (that has been submitted). NCBI is not in
> > > > a position to assess the validity of such claims and since there is no
> > > > transfer of rights from submitters to NCBI, NCBI has no rights to
> > > > transfer to a third party. Therefore, NCBI cannot provide comment or
> > > > unrestricted permission concerning the use, copying, or distribution of
> > > > the information contained in the molecular databases.
> > > >
> > > > Should I contact the original authors for permission? Or is the fact
> > > > that the data were publicly shared enough to grant me permission to
> > > > redistribute?
> > > > In that case, is there a standard license that I should use?
> > > >
> > > > Thanks for any feedback / thought!
> > >
> > > I don't have much to offer. AFAIK we don't really have guidelines or
> > > recommendations for what license to use for experimental data packages,
> > > except for the usual "make sure you use an appropriate license" advice.
> > > So far it has really been up to each author/maintainer to make sure
> > > they pick a license that is compatible with the
> > > license/copyright/patent of the original data they are packaging
> > > and with its redistribution through the Bioconductor channel.
> > >
> > > FWIW here is a summary of the licenses used by the 276 experimental
> > > data packages currently in BioC devel:
> > >
> > >    License       Nb of packages
> > >    ------------  --------------
> > >    GPL                      135
> > >    Artistic-2.0              96
> > >    LGPL                      41
> > >    other                      4
> > >
> > > It would be interesting to hear from other developers about this. For
> > > example, how do people choose between GPL and Artistic-2.0? Is one
> > > license typically more appropriate for packaging and redistributing
> > > data that is already publicly available?
> > >
> > > H.
> > >
> > > >
> > > > Best,
> > > > davide
> > > >
> > > >       [[alternative HTML version deleted]]
> > > >
> > > > _______________________________________________
> > > > Bioc-devel at r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> > > >
> > >
> > > --
> > > Hervé Pagès
> > >
> > > Program in Computational Biology
> > > Division of Public Health Sciences
> > > Fred Hutchinson Cancer Research Center
> > > 1100 Fairview Ave. N, M1-B514
> > > P.O. Box 19024
> > > Seattle, WA 98109-1024
> > >
> > > E-mail: hpages at fredhutch.org
> > > Phone:  (206) 667-5791
> > > Fax:    (206) 667-1319
> > >
