[Rd] R datasets ownership(copyright) and license

Yaroslav Halchenko yarikoptic at gmail.com
Tue Apr 3 00:06:14 CEST 2012


Dear R Developers,

The recently filed (and dismissed ;) ) lawsuit by Astrolabe against the
tz database developers caused a lot of press coverage and discussion,
and set some kind of precedent in the USA [3].  But imho it also showed
that similar attacks might happen in the future, possibly against
data sets which are not so obviously "factual" and thus might after all
fall under copyright or IP protection, if not in the States then in
some other jurisdiction.

Since the 'data copyright/license' question comes up over and over
again, I just wanted to ask which policies or advisories guided the
selection of the datasets shipped with R.  From a very brief look at the
datasets, many of them appear to be factual data and thus, at least at
the moment, are probably not copyrightable in the States -- but is there
any guarantee that they are not protected by copyright elsewhere if they
originate abroad?  Some, however, seem to come from published works
(still) under copyright with "All rights reserved", e.g. datasets
Harman23 and Harman74 [4].

Although questions similar to mine have been raised before [e.g. 1,2], I
have not found a straight answer, e.g. one from the list below or a mix
of them:

1. we simply did not look into it and adopted them with the idea that if
   someone complains, we remove the corresponding pieces

2. we considered all datasets to be factual data and thus not
   copyrightable (in the USA? around the globe?)

3. for each dataset (or some, or the majority) we collected information
   on the possible copyright+license/IP holder and contacted them, where
   unclear, about permission for reuse in a project under the GPL license

Thank you in advance for the clarification!

P.S. Please do not get me wrong -- I am not trying to pick on
anyone.  I just wanted to get a better sense of the
procedures/assumptions R developers use when adopting data for the R
package, so that it could be of help to other projects.

[1] https://stat.ethz.ch/pipermail/r-help/2007-April/130422.html
[2] http://www.mail-archive.com/r-help@r-project.org/msg62486.html
[3] http://en.wikipedia.org/wiki/Tz_database
[4] interestingly, the actual data there come from an "unpublished PhD
    thesis", but once again from the U of Chicago, which holds the
    copyright for the book itself.

-- 
Yaroslav O. Halchenko
Postdoctoral Fellow,   Department of Psychological and Brain Sciences
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik


