[R] matching-case sensitivity

Spencer Graves spencer.graves at pdf.com
Tue Aug 26 23:50:11 CEST 2003


Alternatively, you could use casefold(), which converts to lower case by 
default and to upper case with upper = TRUE.  This would make your code 
more compatible with S-Plus.  For me, toupper() and tolower() are easier 
names to remember and easier to read.  However, if you think that 
someone might want to run your code in S-Plus, then casefold() 
might be the better choice.
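
As a minimal sketch of the casefold() approach, assuming two made-up 
email vectors (list1 and list2 are hypothetical names, not from the 
original question), both sides are folded to lower case before matching:

```r
# Hypothetical example data: the same addresses written in different case.
list1 <- c("Alice@Example.com", "bob@example.com", "carol@example.com")
list2 <- c("alice@example.com", "BOB@EXAMPLE.COM", "dave@example.com")

# Elements of list1 that also occur in list2, ignoring case.
# casefold() converts to lower case by default (upper = FALSE).
matched <- list1[casefold(list1) %in% casefold(list2)]

# Positions in list2 of each element of list1 (NA where there is no match).
idx <- match(casefold(list1), casefold(list2))

print(matched)
print(idx)
```

The original spelling in list1 is preserved in the result because only 
the comparison keys are folded, not the data itself.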

Hope this helps.  Spencer Graves

Marc Schwartz wrote:
> On Tue, 2003-08-26 at 15:09, Jablonsky, Nikita wrote:
> 
>>Hi All,
>>
>>I am trying to match two character arrays (email lists) using either
>>pmatch(), match() or charmatch() functions. However the function is
>>"missing" some matches due to differences in the cases of some letters
>>between the two arrays. Is there any way to disable case sensitivity or is
>>there an entirely better way to match two character arrays that have
>>identical entries but written in different case?
>>
>>Thanks
>>Nikita
> 
> 
> 
> At least two options for case insensitive matching:
> 
> 1. use grep(), which has an 'ignore.case' argument that you can set to
> TRUE. See ?grep
> 
> 2. use the function toupper() to convert both character vectors to all
> upper case. See ?toupper.  Conversely, tolower() would do the opposite.
> 
> 
> A quick solution using the second option would be:
> 
> Vector1[toupper(Vector1) %in% toupper(Vector2)]
> 
> which would return the elements that match in both vectors.
> 
> 
> A more formal example with some data:
> 
> Vector1 <- letters[1:10]
> Vector1
> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
> 
> 
> Vector2 <- c(toupper(letters[5:8]), letters[9:15])
> Vector2
> [1] "E" "F" "G" "H" "i" "j" "k" "l" "m" "n" "o"
> 
> 
> Vector1[toupper(Vector1) %in% toupper(Vector2)]
> [1] "e" "f" "g" "h" "i" "j"
> 
> 
> HTH,
> 
> Marc Schwartz
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
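
One caveat on the grep() option in the quoted reply: grep() does regular 
expression matching, not whole-string matching, so it can return more 
than the exact address.  The sketch below uses made-up addresses (the 
vector name "addresses" is hypothetical) to compare an unanchored 
pattern, an anchored and escaped pattern, and a plain tolower() 
comparison:

```r
# Hypothetical example data.
addresses <- c("Bob@Example.com", "notbob@example.com", "carol@example.com")

# Unanchored pattern: also hits "notbob@example.com", and "." matches any character.
loose <- grep("bob@example.com", addresses, ignore.case = TRUE)

# Anchored pattern with the dot escaped: matches the whole string only.
anchored <- grep("^bob@example\\.com$", addresses, ignore.case = TRUE)

# Exact comparison after case conversion, no regular expressions involved.
exact <- which(tolower(addresses) == "bob@example.com")

print(loose)
print(anchored)
print(exact)
```

For matching two whole lists against each other, the toupper()/%in% 
form shown in the quoted reply avoids having to escape or anchor 
anything.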



