[Rd] Bounty on Error Checking

Duncan Murdoch murdoch.duncan at gmail.com
Fri Jan 4 15:03:31 CET 2013


On 13-01-04 8:32 AM, Matthew Dowle wrote:
>
> On Thu, Jan 3, 2013, Bert Gunter wrote:
>> Well...
>>
>> On Thu, Jan 3, 2013 at 10:00 AM, ivo welch <ivo.welch <at> anderson.ucla.edu> wrote:
>>>
>>> Dear R developers---I just spent half a day debugging an R program
>>> that had two bugs. First, I selected a wrongly named variable, which
>>> turned out to be a scalar and which then happily multiplied as if it
>>> were a matrix. Second, I used another wrongly named variable from a
>>> data frame, which triggered no error when used as a[["name"]] or
>>> a$name. There should be an option that, when turned on, makes R
>>> throw an error when one does this. I cannot imagine that there is
>>> much code that wants to reference non-existing columns in data frames.
>>
>> But I can -- and do it all the time: To add a new variable, "d", to a
>> data frame, df, containing only "a" and "b" (with 10 rows, say):
>>
>> df[["d"]] <- 1:10
>
> Yes but that's `[[<-`. Ivo was talking about `[[` and `$`; i.e., select
> only not assign, if I understood correctly.
>
>>
>> Trying to outguess documentation to create error triggers is a very
>> bad idea.
>
> Why exactly is it a very bad idea? (I don't necessarily disagree, just
> asking for more colour.)
>
>> R already has plenty of debugging tools -- and there is even a "debug"
>> package. Perhaps you need a better programming editor/IDE. There are
>> several listed on CRAN, RStudio, etc.
>
> True, but that relies on you knowing there's a bug to hunt for. What if
> you don't know you're getting incorrect results, silently? In a similar
> way that options(warn=2) turns known warnings into errors, to enable you
> to be more strict if you wish,

I would say the point of options(warn=2) is rather to let you find the 
location of the warning more easily, because it will abort the 
evaluation.  I would not recommend using code that issues warnings.
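
As a small illustration of that use (a sketch only; the function f is made up for the example), as.integer("x") normally warns and returns NA, but with warn = 2 the warning becomes an error that stops evaluation at the point where it arises:

```r
# f is a hypothetical function used only for this illustration.
# as.integer("x") normally warns ("NAs introduced by coercion") and returns NA.
f <- function(x) as.integer(x)

old <- options(warn = 2)   # promote every warning to an error
res <- tryCatch(f("x"), error = function(e) "stopped")
options(old)               # restore the previous setting
res
```

Run interactively without the tryCatch(), the error can then be located with traceback().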

> an option to turn on warnings from `[[` and `$` if the column is missing
> (select only, not assign) doesn't seem like a bad option to have. Maybe
> it would reveal some previously silent bugs.
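
For concreteness, the select-versus-assign distinction discussed here looks roughly like this in current R (a minimal sketch; the data frame and the variable names are invented for the example):

```r
# Illustrative data frame with columns "a" and "b" only.
df <- data.frame(a = 1:3, b = 4:6)

missing_dollar  <- df$d        # selecting a missing column gives NULL, silently
missing_bracket <- df[["d"]]   # the same with `[[`
scaled <- 2 * df$d             # the NULL propagates: a zero-length result

df[["d"]] <- 7:9               # assignment (`[[<-`) creates the column
```

Only the first three lines are the select case that the proposed option would flag; the assignment on the last line is the legitimate use Bert described.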

I agree that this would sometimes be useful, but a very common 
convention is to do something like

if (is.null(obj$element)) {  do something }

These would all have to be re-written to something like

if (missing.field(obj, "element")) { do something }

There are several hundred examples of the first usage in base R; I 
imagine thousands more in contributed packages.  I don't think the 
benefit of the change is worth all the work that would be necessary to 
implement it.
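
For anyone who wants the stricter behaviour in their own code without any change to R itself, an opt-in accessor is one possibility. This is only a sketch: missing.field is the made-up name from the example above, and strict_get is likewise hypothetical; neither exists in base R.

```r
# Hypothetical helper, named after the example above; not part of base R.
missing.field <- function(obj, name) !(name %in% names(obj))

# Hypothetical opt-in accessor: stops instead of silently returning NULL.
strict_get <- function(obj, name) {
  if (missing.field(obj, name)) {
    stop(sprintf("no element named '%s'", name), call. = FALSE)
  }
  obj[[name]]
}

obj <- list(element = 1:3)
strict_get(obj, "element")   # returns the element as usual
msg <- tryCatch(strict_get(obj, "elemnt"), error = conditionMessage)
```

Note that a name-based test like this is not exactly equivalent to is.null(obj$element): a list can contain an element that exists but is NULL, and `$` does partial matching on names.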

Duncan

>
> Anyway, I'm hoping Ivo will let us know if he likes the simple mask I
> proposed, or not. That's already an option that can be turned on or off.
> But if his bug was selecting the wrong column, not a missing one, then
> I'm not sure anything could be (or needs to be) done about that.
>
> Matthew
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


