[R] min(NA,"bla") != min("bla", NA)

Duncan Murdoch murdoch.duncan at gmail.com
Fri Sep 27 03:38:11 CEST 2013


On 13-09-26 9:26 PM, Ista Zahn wrote:
> On Thu, Sep 26, 2013 at 9:10 PM, Rolf Turner <rolf.turner at xtra.co.nz> wrote:
>> On 09/27/13 11:07, Duncan Murdoch wrote:
>>>
>>> On 13-09-26 5:32 PM, Rolf Turner wrote:
>>>>
>>>>
>>>> Just to add to the confusion, on my system I get NA --- which I
>>>> understand to be
>>>> the correct value --- from all of min(NA,"bla"), min("bla",NA),
>>>> min(c(NA,"bla")), and
>>>> min(c("bla",NA)).  When I append the argument na.rm=TRUE to each of the
>>>> calls,
>>>> I get "bla" from each.
>>>>
>>>> So, no bug in my system.
>>>
>>>
>>> I've just built a 3.0.1 version, and I definitely see the bug there. What
>>> do you get from these expressions?
>>>
>>> str(min(NA, "bla"))
>>> str(min("bla", NA))
>>> str(min(NA_character_, "bla"))
>>>
>>> I get
>>>
>>>> str(min(NA, "bla"))
>>>   int NA
>>>> str(min("bla", NA))
>>>   chr "bla"
>>>> str(min(NA_character_, "bla"))
>>>   chr "bla"
>>>
>>> on both Windows and OSX R 3.0.1.  After today's patch, I get
>>>
>>>   chr NA
>>>
>>> for all three.
>>
>>
>> I get:
>>
>>> str(min(NA, "bla"))
>>   int NA
>>> str(min("bla", NA))
>>   chr NA
>>> str(min(NA_character_, "bla"))
>>   chr NA
>>
>> Which looks to me to be as it should be.  How come my system's so good
>> compared to others? :-)
>
> It's not; you just have your locale set differently;

Nice catch.  The good news is that min() works in the C locale; the bad 
news is that max() doesn't.  But both work after today's patch.

Duncan Murdoch
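
For anyone who wants to check their own build, here is a minimal sketch of
the test discussed in this thread. It runs the same loop over letters under
the current collation locale and then under "C", for both min() and max().
The helper name check_na_propagation is hypothetical (it is not part of R),
and the sketch assumes that LC_COLLATE is the locale category that matters
here. On a patched R every entry should be TRUE; on an unpatched 3.0.1 some
entries may be FALSE, depending on the locale.

```r
# Hypothetical helper: for each string, check that NA propagates through
# min() and max() regardless of argument order.
check_na_propagation <- function(values = letters) {
  vapply(values, function(s) {
    is.na(min(s, NA)) && is.na(min(NA, s)) &&
      is.na(max(s, NA)) && is.na(max(NA, s))
  }, logical(1))
}

# Remember the current collation locale so it can be restored afterwards.
old_collate <- Sys.getlocale("LC_COLLATE")

# Results under whatever collation locale the session started with.
current <- check_na_propagation()

# Results under the C locale (assumption: collation is the relevant category).
invisible(Sys.setlocale("LC_COLLATE", "C"))
in_c <- check_na_propagation()

# Restore the original collation locale.
invisible(Sys.setlocale("LC_COLLATE", old_collate))

# Letters for which NA failed to propagate in each setting.
print(names(current)[!current])
print(names(in_c)[!in_c])
```

An empty result from both print() calls means NA propagated for every letter
in both settings.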

>
>> Sys.getlocale()
> [1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=C;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"
>> for(l in letters) print(min(l, NA))
> [1] "a"
> [1] "b"
> [1] "c"
> [1] "d"
> [1] "e"
> [1] "f"
> [1] "g"
> [1] "h"
> [1] "i"
> [1] "j"
> [1] "k"
> [1] "l"
> [1] "m"
> [1] "n"
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
>> Sys.setlocale("LC_ALL", "C")
> [1] "LC_CTYPE=C;LC_NUMERIC=C;LC_TIME=C;LC_COLLATE=C;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=C;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"
>> for(l in letters) print(min(l, NA))
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
> [1] NA
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=C                 LC_NUMERIC=C
>   [3] LC_TIME=C                  LC_COLLATE=C
>   [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> Best,
> Ista
>
>>
>>      cheers,
>>
>>      Rolf
>>
>>


