[R] Change in behaviour of sd()

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Jul 9 00:48:26 CEST 2008


Duncan Murdoch wrote:
> On 08/07/2008 5:01 PM, Rolf Turner wrote:
>> On 8/07/2008, at 7:38 PM, Fiona Johnson wrote:
>>
>>> Hi
>>>
>>> I have just upgraded from R2.6.0 to R2.7.1 (running on Windows) and a part
>>> of my code that previously ran ok now gives an error. The following is a
>>> simple example to demonstrate my problem.
>>>
>>>> a <- array(c(1,2,3,4,5,6,rep(NA,6)),dim=c(6,2))
>>>> apply(a,2,sd,na.rm=T)
>>> In R2.6.0 this gives (which is what I would like)
>>>
>>>  [1] 1.870829       NA
>>>
>>> In R2.7.1 it gives the following error
>>>
>>> "Error in var(x, na.rm = na.rm) : no complete element pairs"
>>>
>>> As my columns are always either all NA or all numbers, I could get around
>>> it by replacing the NA's with 0's but if someone could shed some light on
>>> why the behaviour has changed in the new version or a better work around
>>> it would be much appreciated. I want to keep the columns of NA's because
>>> ultimately I am plotting the results with contour and the NA's refer to
>>> grid cells not on land where I don't want to have contours.
>>
>> I just scanned through the release announcements (from Peter Dalgaard)
>> about new versions of R (R home page --> What's new? --> Archive of
>> important announcements) and found nothing about new behaviour for
>> sd/var/cov.  So I cannot contribute to enlightenment about ``why''.
>
> This is the relevant but not so obvious NEWS entry:
>
>     o    co[rv](use = "complete.obs") now always gives an error if there
>     are no complete cases: they used to give NA if
>     method = "pearson" but an error for the other two methods.  (Note
>     that this is pretty arbitrary, but zero-length vectors always
>     give an error so it is at least consistent.)
>
>     cor(use="pair") used to give diagonal 1 even if the variable
>     was completely missing for the rank methods but NA for the
>     Pearson method: it now gives NA in all cases.
>
> (sd calls var, which calls cov internally.)
>
Thanks. I was searching for it as well... (BTW, Rolf, I'm not the 
oracle, I just copy parts of the NEWS file into release announcements. 
The file itself is readily available, as others point out.)

I think this has come up before but I forgot the details.

It _is_ a bit odd that the variance of an empty vector is an error but 
that of a one-element vector is NA, and the difference between 
var(NA,na.rm=T) and var(numeric(0)) is not too clear either.

Also,

 > var(NA,use="pair")
[1] NA
 > var(NA,use="co")
Error in var(NA, use = "co") : no complete element pairs
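
For the original problem (columns that are either all NA or all numbers), 
one possible workaround is to drop the NAs by hand and return NA whenever 
fewer than two values remain, so that var() never sees an empty vector. 
This is only a minimal sketch: safe_sd is a hypothetical helper name, not 
something provided by R, and it assumes a numeric input.

```r
# Hypothetical helper (not part of R): standard deviation that returns NA
# instead of raising an error when there are fewer than two non-missing values.
safe_sd <- function(x) {
  x <- x[!is.na(x)]                       # drop missing values ourselves
  if (length(x) < 2L) return(NA_real_)    # nothing to estimate a spread from
  sd(x)
}

# The example from the original question: one complete column, one all-NA.
a <- array(c(1, 2, 3, 4, 5, 6, rep(NA, 6)), dim = c(6, 2))
res <- apply(a, 2, safe_sd)
print(res)
```

Since the NAs are removed before sd() is called, the result does not depend 
on how a given R version treats the "no complete cases" situation.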


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907


