[R] Strsplit code

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Thu Dec 4 13:28:50 CET 2008


John Fox wrote:
> Dear Wacek,
>
> "Wrong" is a bit strong, I think -- limited to single-character patterns is
> more accurate. 

nothing is ever wrong if seen from an appropriate perspective.  for
example, there is nothing wrong with the fact that many core functions in
r deparse some, but not all, of their argument expressions, without any
obvious pattern -- once you get used to it and learn every single case by
heart, it's perfectly correct.


> Moreover, it isn't hard to make the function work with
> multiple-character matches as well:
>   

which you probably should have done before posting the flawed version.

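> Strsplit <- function(x, split){
>     # vectorization: split each element separately, returning a list
>     if (length(x) > 1) {
>         return(lapply(x, Strsplit, split))
>     }
>     result <- character(0)
>     if (nchar(x) == 0) return(result)  # nothing left to split
>     posn <- regexpr(split, x)          # position of the first match
>     if (posn <= 0) return(x)           # no match: x is the last piece
>     # recursion: the piece before the match, then the split remainder
>     c(result, substring(x, 1, posn - 1), 
>         Recall(substring(x, posn + attr(posn, "match.length"), 
>           nchar(x)), split))
> }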
>
> On the other hand, your function is much more efficient.
>   

just one order of magnitude in my tests.  might not be completely
foolproof, though. 
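
a rough sketch of how the quoted Strsplit could be checked and timed.  it
compares against base strsplit() only, since the other function from this
thread is not reproduced in this message; the test strings, the names
one, many and big, and the input size are made up for illustration, and
timings will differ between machines.

```r
# Quoted function, repeated here so the snippet runs on its own.
Strsplit <- function(x, split){
    if (length(x) > 1) {
        return(lapply(x, Strsplit, split))
    }
    result <- character(0)
    if (nchar(x) == 0) return(result)
    posn <- regexpr(split, x)
    if (posn <= 0) return(x)
    c(result, substring(x, 1, posn - 1),
        Recall(substring(x, posn + attr(posn, "match.length"),
          nchar(x)), split))
}

# Multi-character separator, single string.
# Note: for a length-1 input Strsplit returns a character vector,
# while strsplit() always returns a list, hence the [[1]].
one <- Strsplit("a, b, c", ", ")
print(identical(one, strsplit("a, b, c", ", ")[[1]]))

# Multi-character separator, several strings: both return a list.
many <- Strsplit(c("a--b", "c--d--e"), "--")
print(identical(many, strsplit(c("a--b", "c--d--e"), "--")))

# Rough timing on made-up data (illustrative only).
big <- rep("a, b, c, d, e", 1000)
print(system.time(Strsplit(big, ", ")))
print(system.time(strsplit(big, ", ")))

# Caveat, not run: a pattern that can match the empty string
# (for example split = "") gives a zero-length match, so the
# recursion never shortens x and does not terminate normally.
```

the two identical() checks only cover separators that match at least one
character; they say nothing about NA input or zero-length matches.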

vQ


