[R] String comparison, trailing blanks make a difference.

Hadley Wickham h.wickham at gmail.com
Sat Jul 19 21:19:42 CEST 2014


If you have unicode strings, you may need to do even more because
there are often multiple ways of representing the same glyph. I made a
little demo at http://rpubs.com/hadley/unicode-normalisation, since
any unicode characters are likely to get mangled by email.
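
For anyone who cannot open the link, here is a minimal sketch of the
problem using escapes, which survive email. It assumes the stringi
package (not part of base R) and its stri_trans_nfc() function; the
helper name same_after_nfc is made up for illustration, and this is
not necessarily what the linked demo does.

```r
composed   <- "\u00e9"    # one code point: e with acute accent
decomposed <- "e\u0301"   # two code points: "e" + combining acute accent

# Both print as the same glyph, but the code points differ,
# so a plain comparison reports them as different.
print(composed == decomposed)   # FALSE

# Hypothetical helper: normalise both sides to NFC before comparing.
same_after_nfc <- function(a, b) {
  stringi::stri_trans_nfc(a) == stringi::stri_trans_nfc(b)
}

# Only run the comparison if stringi is actually installed.
if (requireNamespace("stringi", quietly = TRUE)) {
  print(same_after_nfc(composed, decomposed))   # TRUE
}
```

Which normalisation form to use (NFC, NFD, NFKC, NFKD) depends on the
data; the point is only that both sides go through the same one.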

Hadley

On Fri, Jul 18, 2014 at 11:32 AM, William Dunlap <wdunlap at tibco.com> wrote:
>>>"abc" == "abc "
>> [1] FALSE
>
> R does no interpretation of strings when doing comparisons, so you do
> have to do your own canonicalization.  That may involve removing
> trailing, leading, or all white space or punctuation, converting to
> lower or upper case, mapping nicknames to official names, trimming to
> a fixed number of characters, etc.
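
A rough base R sketch of a few of the steps listed above. The function
name canonicalize is hypothetical, and which steps are appropriate
(and in what order) depends entirely on the data being compared.

```r
# Hypothetical helper: one possible canonical form, base R only.
canonicalize <- function(x) {
  x <- gsub("[[:punct:]]", "", x)                   # remove punctuation
  x <- gsub("[[:space:]]+", " ", x)                 # collapse runs of white space
  x <- gsub("^[[:space:]]+|[[:space:]]+$", "", x)   # drop leading/trailing white space
  tolower(x)                                        # ignore case
}

# Compare canonical forms rather than the raw strings.
print("abc" == "abc ")                              # FALSE
print(canonicalize("abc") == canonicalize("abc "))  # TRUE
print(canonicalize("  Abc,  Def ") == canonicalize("abc def"))  # TRUE
```
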
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Fri, Jul 18, 2014 at 9:17 AM, John McKown
> <john.archie.mckown at gmail.com> wrote:
>> Well, this was a shock to me. And I don't really see any documentation
>> about it, but perhaps I just can't see it.
>>
>>>"abc" == "abc "
>> [1] FALSE
>>
>> I guess that I thought of strings in R like I do in some other
>> languages where the shorter value is padded with blanks to the length
>> of the longer value, then compared. I.e. that trailing blanks didn't
>> matter.
>>
>> The best solution that I have found is to use the str_trim() function
>> from the stringr package to remove all the trailing blanks after I get the
>> data from the SQL data base. I cannot change the SQL schema to make
>> the column a varchar instead of a char column. It is a vendor DB. And
>> I don't know an ANSI SQL standard way to remove trailing blanks in the
>> SELECT command. PostgreSQL has a "trim(trailing ' ' from column)", but
>> MS-SQL upchucks on that syntax.
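
The trimming can also be done after the query with base R alone. This
is only a sketch: the data frame df below is a made-up stand-in for a
query result with fixed-width char columns, not the actual vendor
table.

```r
# Hypothetical stand-in for a result set with blank-padded CHAR columns.
df <- data.frame(code = c("abc  ", "de   "),
                 n    = 1:2,
                 stringsAsFactors = FALSE)

# Find the character columns and strip trailing spaces from each one;
# numeric and other columns are left untouched.
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], function(col) sub(" +$", "", col))

print(df$code == c("abc", "de"))   # TRUE TRUE
```
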
>>
>> --
>> There is nothing more pleasant than traveling and meeting new people!
>> Genghis Khan
>>
>> Maranatha! <><
>> John McKown
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>



-- 
http://had.co.nz/


