[Rd] no visible binding for global variable for data sets in a package

Michael Friendly friendly at yorku.ca
Wed Aug 27 15:29:57 CEST 2014


On 8/27/2014 5:24 AM, Martin Maechler wrote:
>>>>>> Michael Friendly <friendly at yorku.ca>
>>>>>>      on Tue, 26 Aug 2014 17:58:34 -0400 writes:
>      > I'm updating the Lahman package of baseball statistics to the 2013
>      > release. In addition to the main data sets, the package also contains
>      > several convenience functions that make use of these data sets.  These
>      > now trigger the notes below from R CMD check run with Win builder,
>      > R-devel.  How can I avoid these?
>
>      > * using R Under development (unstable) (2014-08-25 r66471)
>      > * using platform: x86_64-w64-mingw32 (64-bit)
>      > ...
>      > * checking R code for possible problems ... NOTE
>      > Label: no visible binding for global variable 'battingLabels'
>      > Label: no visible binding for global variable 'pitchingLabels'
>      > Label: no visible binding for global variable 'fieldingLabels'
>      > battingStats: no visible binding for global variable 'Batting'
>      > battingStats: no visible global function definition for 'mutate'
>      > playerInfo: no visible binding for global variable 'Master'
>      > teamInfo: no visible binding for global variable 'Teams'
>
>      > One such function:
>
>      > ## function for accessing variable labels
>
>      > Label <- function(var, labels=rbind(battingLabels, pitchingLabels,
>      > fieldingLabels)) {
>      > wanted <- which(labels[,1]==var)
>      > if (length(wanted)) labels[wanted[1],2] else var
>      > }
>
> and you are using the data sets you mentioned before,
> (and the checking has been changed recently here).
>
> This is a bit subtle:
> Your data sets are part of your package (thanks to the default
> lazyData), but *not* part of the namespace of your package.
> Now, the reasoning goes as follows: if someone uses a function
> from your package, say Label() above,
> by
> 	Lahman::Label(..)
> and your package has not been attached to the search path,
> your user will get an error, as the datasets are not found by
> Label().
>
> If you consider something like   Lahman::Label(..)
> for a bit and the emphasis we put on R functions being the
> primary entities, you can understand the current, i.e. new,
> R CMD check NOTEs.
Thanks for this explicit explanation.  Now I understand why this occurs.
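
For illustration, the lookup behaviour described above can be mimicked
without Lahman at all.  This is only a toy sketch: plain environments stand
in for the package namespace and for the attached package on the search
path, the one-row battingLabels is made up, and get_labels() is a
hypothetical stand-in for the free-variable lookup inside Label().

```r
# Toy model only: none of this touches the real Lahman package.
# 'attached_data' plays the role of "package:Lahman" on the search path.
attached_data <- new.env(parent = emptyenv())
assign("battingLabels",
       data.frame(variable = "yearID", label = "Year",
                  stringsAsFactors = FALSE),
       envir = attached_data)

# A "namespace" whose lookup chain reaches the data (package attached) ...
ns_attached <- new.env(parent = attached_data)
# ... and one whose chain does not (package only reached via Lahman::).
ns_not_attached <- new.env(parent = emptyenv())

# Hypothetical function with a free variable, like the one in Label().
get_labels <- function() battingLabels

environment(get_labels) <- ns_attached
found <- get_labels()            # resolved through the parent environment

environment(get_labels) <- ns_not_attached
result <- tryCatch(get_labels(),
                   error = function(e) conditionMessage(e))
print(found)
print(result)                    # the "object not found" error message
```

The same function body succeeds or fails depending only on what its
enclosing environments can reach, which is the situation the check is
flagging.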
>
> I see the following two options for you:
>
> 1) export all these data sets from your NAMESPACE
>     For this (I think), you must define them in  Lahman/R/ possibly via a
>     Lahman/R/sysdata.rda
Not sure I quite understand how this would work.  My NAMESPACE currently
exports the few functions in this package:

# all the rest is data
export(battingStats,
     playerInfo,teamInfo,
     Label
     )

Do you mean to simply add all the data sets ('globals') that are referred
to in these functions?

# all the rest is data
export(battingStats,
     playerInfo,teamInfo,
     Label,
     battingLabels, pitchingLabels, fieldingLabels,
     Batting, Master, Teams
     )

That seems a bit odd.  Can you actually export data?  Maybe there is a need
for a separate NAMESPACE declaration, which might be called either

exportdata()
globaldata()

>
> 2) rewrite your functions such that they ensure the data sets are
>     loaded when they are used.
>
>
> "2)" actually works by adding
>     	stopifnot(require(Lahman, quietly=TRUE))
>    as first line in Label() and other such functions.
>
> It works in the sense that  Lahman::Label("yearID")  will
> work even when Lahman is not in the search path,
> but   R-devel CMD check   will still give the same NOTE,
> though you can argue that that note is actually a "false positive".
So, this would be version 1 of "2)":

Label <- function(var, labels) {
     stopifnot(require(Lahman, quietly=TRUE))
     if(missing(labels))
         labels <- rbind(battingLabels, pitchingLabels, fieldingLabels)
     wanted <- which(labels[,1]==var)
     if (length(wanted)) labels[wanted[1],2] else var
}

And this would be version 2, using data():

Label <- function(var, labels) {
     stopifnot(require(Lahman, quietly=TRUE))
     if(missing(labels)) {
         data(battingLabels); data(pitchingLabels); data(fieldingLabels)
         labels <- rbind(battingLabels, pitchingLabels, fieldingLabels)
         }
     wanted <- which(labels[,1]==var)
     if (length(wanted)) labels[wanted[1],2] else var
}
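
A possible refinement of version 2 is to pass envir = environment() to
data(), so the data set is bound in the function's own frame for that call.
The sketch below uses 'mtcars' from the 'datasets' package as a stand-in for
a Lahman data set, and car_rows() is a hypothetical helper.  Whether this
form quiets the NOTE has not been tested here.

```r
# Stand-in example: 'mtcars' (package 'datasets') plays the role of a Lahman
# data set such as battingLabels; car_rows() is a hypothetical helper.
car_rows <- function() {
    # Load the data set into this call's own frame, so the name is bound
    # locally rather than found via the search path.
    data("mtcars", package = "datasets", envir = environment())
    nrow(mtcars)
}

n <- car_rows()
print(n)
```

Naming the package in the data() call also means the function does not need
the package to be attached first.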


>     
> Not sure about another elegant way to make "2)" work, apart from
> using  data() on each of the datasets inside the
> function.  As I haven't tried it, that may *still* give a
> (false) NOTE..
>
> This is a somewhat interesting problem, and I wonder if everyone
> else has solved it with '1)' rather than a version of '2)'.
>
> Martin
>     
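
One more variant of "2)" that may be worth trying is to qualify each data
set with the package name, since pkg::name can also retrieve lazy-loaded
data sets (for example datasets::mtcars).  The sketch below uses 'datasets'
as a stand-in for Lahman, and mean_mpg() is a hypothetical helper.  That
Lahman::battingLabels and friends behave the same way, and that this quiets
the NOTE under R-devel, are assumptions rather than something verified in
this thread.

```r
# Stand-in example: 'datasets' plays the role of Lahman and 'mtcars' the role
# of a data set such as Batting; mean_mpg() is a hypothetical helper.
mean_mpg <- function(cars = datasets::mtcars) {
    # The package-qualified default leaves no free global variable here.
    mean(cars$mpg)
}

avg <- mean_mpg()
print(avg)
```

Because the default is package-qualified, the call does not depend on the
package being on the search path.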


-- 
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA


