[Rd] R CMD check does not check for superfluous documentation (PR#7231)

Kurt Hornik Kurt.Hornik at wu-wien.ac.at
Fri Sep 17 11:38:53 CEST 2004


>>>>> wolski  writes:

> Hi!

> Due to package maintenance I have removed some functions but forgot to
> update the corresponding Rd files.  If R CMD check checks for
> missing documentation entries, why does it not check for documentation
> entries which tell the user about nonexistent functions?

It can do all these things, but we can have topics which do not have
corresponding functions, nor even R objects.
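
As a quick illustration of such topics, here is a minimal sketch.  The
three names are taken from the listing of base topics further down in
this message, and the sketch assumes a fresh session in which nobody has
defined objects with these names:

```r
## Help topics documented in base that are not R objects.
topics <- c("Arithmetic", "Startup", "regex")
isObject <- vapply(topics, exists, logical(1))
print(isObject)

## For contrast, a topic that is also a function.
print(exists("sum"))
```

Each of these has a help page, yet exists() finds nothing to go with it.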

You can use

library(tools)

classifyRdTopics <-
function(package = "base", lib.loc = NULL)
{
    ## Classify Rd topics in an installed package according to whether
    ## they correspond to an object in the code or not.

    ## Useful output via
    ##   summary(classifyRdTopics())
    
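    ## find.package() and readRDS() replace the older dot-prefixed names.
    dir <- find.package(package, lib.loc)
    isBase <- basename(dir) == "base"
    ## Attach the package so that its objects are on the search path.
    if(!isBase)
        suppressPackageStartupMessages(
            library(package, lib.loc = lib.loc, character.only = TRUE))
    codeEnv <-
        as.environment(paste("package", package, sep = ":"))
    contents <- readRDS(file.path(dir, "Meta", "Rd.rds"))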
    keywords <- contents[ , "Keywords"]
    ## Disregard data sets docs.  Test each entry separately, as a topic
    ## can carry several keywords.
    idx <- !vapply(keywords, function(x) "datasets" %in% x, logical(1))
    ## Also, just to make sure, disregard everything marked as internal.
    idx <- idx & !vapply(keywords, function(x) "internal" %in% x,
                         logical(1))
    ## Should really also ignore defunct and maybe also deprecated stuff
    ## when computing on base.
    contents <- contents[idx, , drop = FALSE]
    aliases <- unlist(contents$Aliases)
    topicIsAnObject <-
        aliases %in% objects(envir = codeEnv, all.names = TRUE)
    x <- list(aliases = aliases,
              topicIsAnObject = topicIsAnObject)
    class(x) <- "classifyRdTopics"
    x
}

summary.classifyRdTopics <- function(object, ...) {
    ## Arguments follow the summary() generic: (object, ...).
    x <- list(table(object$topicIsAnObject),
              object$aliases[!object$topicIsAnObject])
    class(x) <- "summary.classifyRdTopics"
    x
}
print.summary.classifyRdTopics <- function(x, ...) {
    writeLines("Numbers of topics which are objects (or not):")
    print(c(x[[1]]), ...)
    writeLines(c("", "Topics which are not objects:"))
    print(x[[2]], ...)
    invisible(x)
}

and then e.g. get

R> summary(classifyRdTopics("base"))
Numbers of topics which are objects (or not):
FALSE  TRUE 
   90   980 

Topics which are not objects:
 [1] "Arithmetic"         "AsIs"               "bessel"            
 [4] "Bessel"             "Comparison"         "Control"           
 [7] "else"               "DateTimeClasses"    "POSIXct"           
[10] "POSIXlt"            "POSIXt"             "Math.POSIXlt"      
[13] "Date"               "Dates"              "Defunct"           
[16] "Deprecated"         "Extract"            "Subscript"         
[19] "Foreign"            "Logic"              "Memory-limits"     
[22] "Memory"             "NA"                 "NULL"              
[25] "Paren"              "Random.user"        ".Random.seed"      
[28] "RNG"                "Rdconv"             "Rd2txt"            
[31] "Rd2dvi"             "Sd2Rd"              "Special"           
[34] "Startup"            "Rprofile"           ".Rprofile"         
[37] "Rprofile.site"      "Renviron.site"      ".Renviron"         
[40] ".First"             "Syntax"             "Trig"              
[43] "S3Methods"          "fuzzy matching"     "->"                
[46] "->>"                ".Autoloaded"        "base-deprecated"   
[49] "connections"        "connection"         "copyright"         
[52] "copyrights"         "files"              ".Method"           
[55] ".Generic"           ".Group"             ".Class"            
[58] "Math"               "Ops"                "Summary"           
[61] "Arith"              "Compare"            "Complex"           
[64] "Math2"              "group generic"      "Inf"               
[67] "NaN"                "R_LIBS"             ".First.lib"        
[70] ".Last.lib"          "localeconv"         "locales"           
[73] "TRUE"               "FALSE"              "matmult"           
[76] "name"               "NotYetImplemented"  "NotYetUsed"        
[79] ".onLoad"            ".onUnload"          ".onAttach"         
[82] "print.htest"        ".Last"              "regex"             
[85] "regexp"             "regular expression" "tilde"             
[88] ".Traceback"         "InternalMethods"    "Signals"           

to assess the magnitude of the issue at hand: about 10% of the base
topics do *not* correspond to R objects.

Admittedly, some of these are perhaps not quite right, or better made
into \concept entries, but e.g. for a documentation object explaining R
CMD stuff, there will most likely never be an R object to use as topic.
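
In essence, the requested check is a set difference between documented
topics and object names.  A minimal sketch with made-up names (docTopics
and codeObjects are hypothetical and not taken from any package):

```r
docTopics   <- c("fitModel", "plotModel", "oldHelper", "Overview")
codeObjects <- c("fitModel", "plotModel", "newHelper")

## Topics with no matching object: either stale docs or legitimate
## conceptual topics; the names alone cannot tell which.
notObjects <- setdiff(docTopics, codeObjects)

## Objects with no matching topic, i.e. missing documentation.
undocumented <- setdiff(codeObjects, docTopics)

print(notObjects)
print(undocumented)
```

Here "oldHelper" would be a genuine leftover whereas "Overview" would be
a perfectly fine conceptual topic, which is why such a report can only
flag "possible" problems.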

More long term, I am planning to include ways to report "possible"
problems such as the above, perhaps alongside integrating Luke's
code analysis tools, which face a similar issue.  But I don't think we
can have R CMD check generate warnings about possible problems by
default.

[It detects enough real problems these days which package maintainers
are not always quick to fix ...]

-k


> In my opinion checking for documentation entries that document
> non-existing functions is much more important than the other way
> around.  Typing the name of an undocumented function by accident is
> not really a danger, especially if it is not exported.  But to tell the
> user that there is a function if there is none ...

> Looking at the tools::undoc function and at the check(.pl) file, it
> seems to me that all the info is available to also output

> all_doc_topics %w/o% code_objs




> Yours Eryk.



