[Bioc-devel] A handful of check to follow up on R CMD BiocCheck
Kevin RUE
kevinrue67 at gmail.com
Wed Nov 2 23:00:19 CET 2016
Me again :)
Please find attached the first patch to print the first 6 lines over 80
characters long. (I'll get to the tabulation offenders next).
Note that all the offending lines are stored in the "df.length" data.frame.
How about an option like "fullReport=c(FALSE, TRUE)" that prints *all* the
offending lines?
The data.frame also stores the content of the lines for the record, but
does not print them. I think Kasper is right: filename and line should be
enough to track down the line.
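
A minimal sketch of the idea outside of BiocCheck, assuming two hypothetical
helpers, `findLongLines()` and `reportLongLines()`, and a hypothetical
`fullReport` argument (none of these exist in BiocCheck; the patch itself
fills `df.length` inside `checkFormatting()`):

```r
# Hypothetical helper, not part of BiocCheck: collect every line longer than
# `width` characters from a set of files into one data.frame.
findLongLines <- function(files, width = 80L) {
    res <- lapply(files, function(file) {
        lines <- readLines(file, warn = FALSE)
        n <- nchar(lines, allowNA = TRUE)
        # Compute the index once so that line numbers, lengths and content
        # stay aligned; lines whose length is NA are skipped.
        long <- which(!is.na(n) & n > width)
        data.frame(
            File = rep(basename(file), length(long)),
            Line = long,
            Content = lines[long],
            Length = n[long],
            stringsAsFactors = FALSE
        )
    })
    do.call(rbind, res)
}

# Hypothetical reporter: the first 6 offenders by default (head()),
# or all of them when fullReport = TRUE.
reportLongLines <- function(df, fullReport = FALSE) {
    shown <- if (fullReport) df else head(df)
    for (i in seq_len(nrow(shown))) {
        message(sprintf("  %s (line %s): %s characters",
            shown$File[i], shown$Line[i], shown$Length[i]))
    }
    invisible(shown)
}
```

Calling `reportLongLines(findLongLines(files))` would print at most 6
offenders; adding `fullReport = TRUE` would print every row. The content of
each line is stored in the `Content` column but not printed.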
All the best,
Kevin
On Wed, Nov 2, 2016 at 8:08 PM, Kevin RUE <kevinrue67 at gmail.com> wrote:
> Thanks for the feedback!
>
> I also tend to prefer *all* the lines being reported (or to be honest,
> that was really true when I had lots of them; a problem that I largely
> mitigated by fixing all of them once and subsequently paying more attention
> while developing).
>
> Printing the content of the offending line somewhat helps me spot the line
> faster (more so for tab issues). But I must admit that showing the whole
> line is somewhat "overkill". I just started thinking of a compromise being
> to only show the first N characters of the line, with N being 80 minus the
> number of characters necessary to print the filename and line number.
>
> Thanks Martin for pointing out the lines in BiocCheck. (Now I feel bad for
> not having checked sooner.. hehe!)
> I think the idea of BiocCheck showing the first 6 offenders is quite
> nice, as I rarely have more since I use the RStudio "Tools >
> Global Options > Code > Display > Show Margin > Margin column: 80" feature.
>
> I'll have a go at both approaches (developing BiocCheck and my own scripts).
>
> Cheers,
> Kevin
>
>
> On Wed, Nov 2, 2016 at 7:41 PM, Kasper Daniel Hansen <
> kasperdanielhansen at gmail.com> wrote:
>
>> I would prefer all line numbers reported, but on the other hand I am
>> indifferent wrt. the content of the line, unless (say) TABs are marked up
>> somehow.
>>
>> Kasper
>>
>> On Wed, Nov 2, 2016 at 3:17 PM, Martin Morgan <
>> martin.morgan at roswellpark.org> wrote:
>>
>>> On 11/02/2016 02:49 PM, Kevin RUE wrote:
>>>
>>>> Dear all,
>>>>
>>>> Just thought I'd share a handful of scripts that I wrote to follow up on
>>>> certain NOTE messages thrown by R CMD BiocCheck.
>>>>
>>>> https://github.com/kevinrue/BiocCheckTools
>>>>
>>>> They're very simple, but I occasionally find them quite convenient.
>>>> Apologies if something similar already exists somewhere :)
>>>>
>>>
>>> Maybe consider creating a diff against the source code that, e.g.,
>>> reported the first 6 offenders? The relevant lines are near
>>>
>>> https://github.com/Bioconductor-mirror/BiocCheck/blob/master/R/checks.R#L1081
>>>
>>> Martin
>>>
>>>
>>>> All the best,
>>>> Kevin
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>>
>>>
>>>
>>
>>
>
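
The compromise raised in the quoted thread (showing only the first N
characters of an offending line, with N being 80 minus the characters needed
for the filename and line number) could be sketched roughly as follows. The
helper name `formatOffender()` and the "file:line: " prefix layout are
assumptions made for illustration; neither comes from BiocCheck.

```r
# Hypothetical helper, not part of BiocCheck: format one offending line so
# that the whole report line fits within `width` characters.
formatOffender <- function(file, line, content, width = 80L) {
    # Assumed prefix layout: "<file>:<line>: "
    prefix <- sprintf("%s:%s: ", file, line)
    # N = width minus the characters used by the filename and line number
    n <- max(width - nchar(prefix), 0L)
    paste0(prefix, substr(content, 1L, n))
}
```

For example, `formatOffender("R/checks.R", 1081, lines[i])` would yield a
string of at most 80 characters whenever the prefix itself is shorter than 80.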
-------------- next part --------------
diff --git a/R/checks.R b/R/checks.R
index f0b5844..9b1f273 100644
--- a/R/checks.R
+++ b/R/checks.R
@@ -1057,14 +1057,12 @@ checkFormatting <- function(pkgdir)
tablines <- 0L
badindentlines <- 0L
ok <- TRUE
-
- df.length <- data.frame(stringsAsFactors=FALSE)
- df.i <- 1
+
for (file in files)
{
- pkgname <- getPkgNameFromPkgDir(pkgdir)
if (file.exists(file) && file.info(file)$size == 0)
{
+ pkgname <- getPkgNameFromPkgDir(pkgdir)
handleNote(sprintf("Add content to the empty file %s.",
mungeName(file, pkgname)))
}
@@ -1074,23 +1072,14 @@ checkFormatting <- function(pkgdir)
lines <- readLines(file, warn=FALSE)
totallines <- totallines + length(lines)
n <- nchar(lines, allowNA=TRUE)
- n <- n[!is.na(n)]; lines <- lines[!is.na(n)]
+ n <- n[!is.na(n)]
names(n) <- seq_along(1:length(n))
- long <- which(n > 80)
-
+ long <- n[n > 80]
if (length(long))
{
## TODO/FIXME We could tell the user here which lines are long
## in which files.
- for (i in long){
- df.length[df.i,1] <- mungeName(file, pkgname) # filename
- df.length[df.i,2] <- names(n)[i] # line number
- df.length[df.i,3] <- lines[i] # content
- df.length[df.i,4] <- n[i] # length
- df.i <- df.i + 1
- }
-
longlines <- longlines + length(long)
}
@@ -1111,22 +1100,12 @@ checkFormatting <- function(pkgdir)
}
}
- colnames(df.length) <- c("File", "Line", "Content", "Length")
-
if (longlines > 0)
{
ok <- FALSE
- h.length <- head(df.length)
handleNote(sprintf(
"Consider shorter lines; %s lines (%i%%) are > 80 characters long.",
longlines, as.integer((longlines/totallines) * 100)))
- message(sprintf(" The first %i lines are:", nrow(h.length)))
- for (i in 1:nrow(h.length))
- {
- row <- h.length[i,]
- message(sprintf(" %s (line %s): %s characters",
- row$File, row$Line, row$Length))
- }
}
if (tablines > 0)
{