[R] please comment on my function
jim holtman
jholtman at gmail.com
Sat Sep 15 02:06:33 CEST 2012
You can always convert to lower case afterwards, probably on a shorter
vector. You did not indicate that you needed that conversion; it only
looked like you did it for the regular expression.
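
Something along these lines might do it. This is only a rough sketch, not
timed against your data: the name canonicalize.language2 is made up for
illustration, and I am assuming that an upper-case "C" should be treated
like "c". The point is that 'tolower' runs only on the elements that
survive, not on the whole input.

```r
# Hypothetical variant: lower-case at the end, and only the kept elements.
canonicalize.language2 <- function (s) {
  # strip the region suffix from 5-character codes such as "en_US" or "pt-BR"
  long <- nchar(s) == 5
  s[long] <- sub("^([[:alpha:]]{2})[-_][[:alpha:]]{2}$", "\\1", s[long])
  # keep 2-character codes and the "C" locale (either case, an assumption)
  keep <- nchar(s) == 2 | s == "c" | s == "C"
  s[!keep] <- "unknown"
  # lower-case only what is left, which is usually a shorter vector
  s[keep] <- tolower(s[keep])
  s
}

canonicalize.language2(c("EN", "en_US", "pt-BR", "C", "english"))
```

If the input has many repeated values, running the function on unique(s)
and mapping back with match() may shorten the vector further, but I have
not measured that.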
On Fri, Sep 14, 2012 at 3:13 PM, Sam Steingold <sds at gnu.org> wrote:
>> * jim holtman <wubygzna at tznvy.pbz> [2012-09-14 13:10:37 -0400]:
>>
>> more than half the time is in 'tolower' and 'nchar', so it is not all
>> 'sub's problem.
>
> aha, thanks!
>
>> This version runs a little faster since it does not need the 'tolower':
>>
>> canonicalize.language <- function (s) {
>>   # s <- tolower(s)
>>   long <- nchar(s) == 5
>>   s[long] <- sub("^([[:alpha:]]{2})[-_][[:alpha:]]{2}$", "\\1", s[long])
>>   s[nchar(s) != 2 & s != "c"] <- "unknown"
>>   s
>> }
>
> but it does not convert "EN" to "en", so it is not good for my purposes.
>
> --
> Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
> http://www.childpsy.net/ http://thereligionofpeace.com http://mideasttruth.com
> http://iris.org.il http://honestreporting.com http://memri.org
> Life is like Tetris: failures accumulate, successes fade.
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.