[R] Change in behaviour of sd()

Tue Jul 8 23:46:28 CEST 2008

On 08/07/2008 5:01 PM, Rolf Turner wrote:
> On 8/07/2008, at 7:38 PM, Fiona Johnson wrote:
> 
>> Hi
>>
>> I have just upgraded from R2.6.0 to R2.7.1 (running on Windows) and a part
>> of my code that previously ran ok now gives an error. The following is a
>> simple example to demonstrate my problem.
>>
>>> a <- array(c(1,2,3,4,5,6,rep(NA,6)),dim=c(6,2))
>>> apply(a,2,sd,na.rm=TRUE)
>> In R2.6.0 this gives (which is what I would like)
>>
>>  [1] 1.870829       NA
>>
>> In R2.7.1 it gives the following error
>>
>> "Error in var(x, na.rm = na.rm) : no complete element pairs"
>>
>> As my columns are always either all NA or all numbers, I could get around it
>> by replacing the NAs with 0s, but if someone could shed some light on why
>> the behaviour has changed in the new version, or suggest a better workaround,
>> it would be much appreciated. I want to keep the columns of NAs because
>> ultimately I am plotting the results with contour and the NAs refer to grid
>> cells not on land, where I don't want to have contours.
> 
> I just scanned through the release announcements (from Peter Dalgaard) about
> new versions of R (R home page --> What's new? --> Archive of important
> announcements) and found nothing about new behaviour for sd/var/cov.  So I
> cannot contribute to enlightenment about ``why''.

This is the relevant but not so obvious NEWS entry:

     o	co[rv](use = "complete.obs") now always gives an error if there
	are no complete cases: they used to give NA if
	method = "pearson" but an error for the other two methods.  (Note
	that this is pretty arbitrary, but zero-length vectors always
	give an error so it is at least consistent.)

	cor(use="pair") used to give diagonal 1 even if the variable
	was completely missing for the rank methods but NA for the
	Pearson method: it now gives NA in all cases.

(sd calls var, which calls cov internally.)
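
For columns that are either entirely NA or entirely numeric, as in the example above, one possible workaround is to test for a fully missing column before sd() is ever called. This is only a minimal sketch: safe_sd is a hypothetical helper name, not a function in base R.

```r
# safe_sd is a hypothetical helper (an assumption for illustration),
# not part of base R. It returns NA when the input has no non-missing
# values and otherwise defers to sd() with na.rm = TRUE.
safe_sd <- function(x) {
  if (all(is.na(x))) return(NA_real_)
  sd(x, na.rm = TRUE)
}

# The toy example from the original question.
a <- array(c(1, 2, 3, 4, 5, 6, rep(NA, 6)), dim = c(6, 2))
col_sd <- apply(a, 2, safe_sd)
```

Because the check happens before sd() runs, the result should not depend on how a given R version treats the case of no complete observations.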

Duncan Murdoch

> The following function might provide a suitable workaround:
> 
> my.sd <- function (x, na.rm = FALSE)
> {
>      if (is.matrix(x))            # matrix: recurse over columns
>          apply(x, 2, my.sd, na.rm = na.rm)
>      else if (is.vector(x)) {
>          if(na.rm) x <- x[!is.na(x)]
>          # nothing left to summarise: return NA rather than an error
>          if(length(x) == 0) return(NA)
>          sqrt(var(x, na.rm = na.rm))
>      }
>      else if (is.data.frame(x))   # data frame: recurse over columns
>          sapply(x, my.sd, na.rm = na.rm)
>      else {                       # anything else: coerce to a vector first
>          x <- as.vector(x)
>          my.sd(x,na.rm=na.rm)
>      }
> }
> 
> It seems to work on your toy example at least.  Note that
> my.sd(numeric(0)) returns NA, but my.sd(NULL) throws an error.
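
If the same script has to run on R versions on either side of this change, another option is to trap the error and turn it into NA. The sketch below uses tryCatch() for that; sd_or_na is a hypothetical name chosen for illustration. Note that it converts any error raised by sd() into NA, not only the "no complete element pairs" one, so it is a blunter tool than an explicit check.

```r
# sd_or_na is a hypothetical wrapper (an assumption for illustration).
# It calls sd() with na.rm = TRUE and returns NA if sd() signals an error.
# Caveat: every error from sd() is mapped to NA, whatever its cause.
sd_or_na <- function(x) {
  tryCatch(sd(x, na.rm = TRUE), error = function(e) NA_real_)
}

a <- array(c(1, 2, 3, 4, 5, 6, rep(NA, 6)), dim = c(6, 2))
res <- apply(a, 2, sd_or_na)
```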
> 
> 	cheers,
> 
> 		Rolf Turner