[Bioc-devel] ddply causes error during R check

Martin Morgan mtmorg@n@b|oc @end|ng |rom gm@||@com
Tue Feb 12 14:58:37 CET 2019


Use `utils::globalVariables(c("name", "gene", "value"))` to declare these symbols and quieten the warnings, at the expense of quietening warnings about undefined variables with these names in _all_ of the package's code, potentially silencing true positives.

Alternatively, avoid non-standard evaluation (this is what ddply is doing, using special rules to resolve symbols like `name`) by using base R functionality. Note also that non-standard evaluation is prone to typos, e.g., silently finding the typo `hpx` in the calling environment rather than failing because it is not in the data frame:

> hpx = 1
> ddply(mtcars, "cyl", "summarize", value = mean(hpx))  ## oops, meant `mean(hp)`
  cyl summarize
1   4         1
2   6         1
3   8         1

Marginally better is

> aggregate(hp ~ cyl, mtcars, mean)
  cyl        hp
1   4  82.63636
2   6 122.28571
3   8 209.21429

where R recognizes symbols in the formula `hp ~ cyl` as intentionally unresolved, so the check does not flag them. The wizards on the list might point to constructs in the rlang package.
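
For the function in the original post, a base R version might look something like the sketch below. This is only one possible rewrite, not tested inside a package: the body is my own guess at the intended behaviour, it returns a plain data.frame rather than a tibble, and it orders the groups by sorted `name`. Columns are looked up with `[[` and a character string, so there are no bare symbols for the check to complain about.

```r
## Hypothetical base R rewrite of summarizeDataFrame(); the column
## names "name", "gene" and "value" are taken from the original post.
summarizeDataFrame <- function(dat) {
    ## row indices of 'dat', grouped by the values of the 'name' column
    idx <- split(seq_len(nrow(dat)), dat[["name"]])

    ## one comma-separated string of unique genes per group
    gene <- vapply(idx, function(i) {
        paste(unique(dat[["gene"]][i]), collapse = ",")
    }, character(1), USE.NAMES = FALSE)

    ## one mean value per group
    value <- vapply(idx, function(i) {
        mean(dat[["value"]][i])
    }, numeric(1), USE.NAMES = FALSE)

    data.frame(name = names(idx), gene = gene, value = value,
               row.names = NULL, stringsAsFactors = FALSE)
}

## Same example data as in the original post, as a plain data.frame
dat <- data.frame(
    name = c(paste0("position", 1:5), paste0("position", 1:3)),
    gene = paste0("gene", 1:8),
    value = 1:8,
    stringsAsFactors = FALSE
)
res <- summarizeDataFrame(dat)
```

A side effect of standard evaluation is that nothing is searched for in the calling environment, so a stray global variable cannot silently stand in for a column.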

Martin

On 2/12/19, 2:35 AM, "Bioc-devel on behalf of web working" <bioc-devel-bounces using r-project.org on behalf of webworking using posteo.de> wrote:

    Hi,
    
    I am developing a Bioconductor package and cannot get rid of some 
    warning messages. During devtools::check() I get the following warning 
    messages:
    
    ...
    summarizeDataFrame: no visible binding for global variable ‘name’
    summarizeDataFrame: no visible binding for global variable ‘gene’
    summarizeDataFrame: no visible binding for global variable ‘value’
    ...
    
    Here is a short version of the function:
    
    #' Collapse rows with duplicated name column
    #'
    #' @param dat a \cite{tibble} with the columns name, gene and value
    #' @importFrom plyr ddply
    #' @import tibble
    #' @return a \cite{tibble}
    #' @export
    #'
    #' @examples
    #' dat <- tibble(name = c(paste0("position", 1:5), paste0("position", c(1:3))),
    #'               gene = paste0("gene", 1:8), value = 1:8)
    #' summarizeDataFrame(dat)
    summarizeDataFrame <- function(dat){
       ddply(dat, "name", "summarize",
             name=unique(name),
             gene=paste(unique(gene), collapse = ","),
             value=mean(value))
    }
    
    R interprets the "name", "gene" and "value" column names as variables 
    during the check. Does anyone have an idea how to change the syntax of 
    ddply or how to get rid of the warning message?
    
    Thanks in advance!
    
    Tobias
    
    _______________________________________________
    Bioc-devel using r-project.org mailing list
    https://stat.ethz.ch/mailman/listinfo/bioc-devel
    

