[Rd] var() with 0-length vector -- docs inconsistent with result

Martin Maechler maechler at stat.math.ethz.ch
Wed Sep 12 11:50:58 CEST 2018


>>>>> Raubertas, Richard via R-devel 
>>>>>     on Tue, 11 Sep 2018 18:52:55 +0000 writes:

    > R 3.5.1 on Windows 7.  The documentation for 'var' says:

    > "These functions return 'NA' when there is only one
    > observation (whereas S-PLUS has been returning 'NaN'), and
    > fail if 'x' has length zero."  


Well, that help says much more, notably the paragraph
immediately before the sentence you cite ends saying

     Note that (the equivalent of) ‘var(double(0), use = *)’ gives ‘NA’
     for ‘use = "everything"’ and ‘"na.or.complete"’, and gives an
     error in the other cases.

which is true.
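
A quick way to check that paragraph is to loop over the documented 'use'
values. This is only a sketch: the names `uses` and `res` are made up for
illustration, and it records just whether an error was signalled, not the
error message.

```r
# All 'use' values documented for var() / cov().
uses <- c("everything", "all.obs", "complete.obs",
          "na.or.complete", "pairwise.complete.obs")

# For each 'use', record the formatted result, or "error" if var() fails.
res <- vapply(uses, function(u) {
    tryCatch(format(var(double(0), use = u)),
             error = function(e) "error")
}, character(1))

print(res)
```

Going by the help text, "everything" and "na.or.complete" should show NA
and the other three should show an error.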

Thank you, Richard, for the report.
The current docs are indeed misleading here.
I think that just erasing the ending half-sentence

 " , and fail if 'x' has length zero. "  

should do.

    > The function 'sd' (based on 'var') has similar documentation.

indeed... and "much worse", it says

  The standard deviation of a zero-length vector (after removal of
  ‘NA’s if ‘na.rm = TRUE’) is not defined and gives an error.  

I propose to also just amend the documentation there, and not change
the code (as you, Richard, also seem to favor).
After all, `NA` is also pretty close to "not defined", and in that sense valid.
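
For the record, a small sketch of what 'sd' does in these corner cases,
assuming the current behaviour where sd() is computed via var() with the
default 'use'. The variable names are only for illustration.

```r
x0 <- numeric(0)

s_empty    <- sd(x0)                  # zero-length input, na.rm = FALSE
s_empty_rm <- sd(x0, na.rm = TRUE)    # zero-length input, na.rm = TRUE
s_all_na   <- sd(c(NA_real_, NA_real_), na.rm = TRUE)  # empty after NA removal

# Each of these is expected to be NA rather than an error.
print(c(s_empty, s_empty_rm, s_all_na))
```

So the sentence in ?sd about "gives an error" does not match what the
code does in any of these cases.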

Martin

    > However, I get:
    >  > var(numeric(0))
    >  [1] NA

    > rather than an error.

    > Personally I prefer that basic summary functions like
    > 'var' not throw errors even in corner cases.  But either
    > way, the result and the docs are inconsistent.

    > Richard Raubertas


