[Rd] What to do with an inconsistency in rank() that has always been in S+ and R?

Jens Oehlschlägel joehl at web.de
Fri Oct 27 11:14:25 CEST 2006


Dear R-developers,

I just realized that rank() behaves inconsistently when one of na.last in {TRUE|FALSE} is combined with a ties.method in {"average"|"random"|"max"|"min"}.
The documentation suggests that, e.g. with na.last=TRUE, NAs are treated like the last (=highest) value, which is evidently not the case:

> rank(c(1,2,2,NA,NA), na.last = TRUE, ties.method = c("average", "first", "random", "max", "min")[1])
[1] 1.0 2.5 2.5 4.0 5.0

I'd expect 

[1] 1.0 2.5 2.5 4.5 4.5

instead, but in fact NAs always seem to be treated as if ties.method = "first". I cannot think of a situation in which one would want e.g. ties.method = "average" to apply to everything except NAs.

I am aware that the prototype behaves like this and that R has behaved like this ever since; however, to me this appears very unfortunate. In order not to 'break' existing code, what about adding ties.methods {"NAaverage"|"NArandom"|"NAmax"|"NAmin"} that behave consistently?
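
To illustrate what "consistent" would mean here, a minimal sketch follows. The wrapper name rank_na_ties is hypothetical and not part of R; it only post-processes the result of rank(), and it assumes that rank() gives the NAs consecutive ranks in order of appearance, as in the example above.

```r
# Hypothetical helper, not part of base R: calls rank(), then treats all
# NAs as one tied group and applies the same ties.method to them.
rank_na_ties <- function(x, na.last = TRUE,
                         ties.method = c("average", "first", "random", "max", "min")) {
  ties.method <- match.arg(ties.method)
  r <- rank(x, na.last = na.last, ties.method = ties.method)

  # Only na.last = TRUE or FALSE assigns ranks to NAs; for "keep" or NA
  # the result of rank() is returned unchanged.
  if (!is.logical(na.last) || is.na(na.last)) return(r)

  nas <- is.na(x)
  if (!any(nas)) return(r)

  na_ranks <- r[nas]  # the ranks rank() handed out to the NAs
  r[nas] <- switch(ties.method,
    average = mean(na_ranks),
    min     = min(na_ranks),
    max     = max(na_ranks),
    random  = na_ranks[sample.int(length(na_ranks))],  # random permutation
    first   = na_ranks)                                # current behaviour
  r
}

rank_na_ties(c(1, 2, 2, NA, NA), na.last = TRUE, ties.method = "average")
# [1] 1.0 2.5 2.5 4.5 4.5
```

The same effect could be offered through the new ties.methods suggested above; the wrapper is just one way to try the behaviour without touching rank() itself.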

Best regards


Jens Oehlschlägel


P.S. Please cc me, as I am not on the list.


> version
               _                           
platform       i386-pc-mingw32             
arch           i386                        
os             mingw32                     
system         i386, mingw32               
status                                     
major          2                           
minor          4.0                         
year           2006                        
month          10                          
day            03                          
svn rev        39566                       
language       R                           
version.string R version 2.4.0 (2006-10-03)



