[R] splitting strings efficiently
drflxms
drflxms at googlemail.com
Mon Jan 9 00:56:39 CET 2012
Hi Andrew,
I am aware that this is an R mailing list, but for such tasks (I deal a
lot with huge genomic datasets) I tend to use awk and sed to preprocess
the data whenever I run into performance problems.
Otherwise, for handling strings in R I recommend the stringr package, but
I don't know about its performance...
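
Staying within R, a minimal sketch of a vectorized split in base R might
look like the following. It assumes the column is called ComputerName as
in the quoted message below and that every address has exactly four
dot-separated parts; the three sample addresses are made up for
illustration, and it has not been timed on the full 4861469 rows.

```r
# Hypothetical stand-in for the data frame from the quoted question;
# only the column name ComputerName is taken from that post.
data <- data.frame(
  ComputerName = c("192.168.1.10", "10.0.0.1", "172.16.254.3"),
  stringsAsFactors = FALSE
)

# strsplit() is vectorized: one call splits every address at once.
# fixed = TRUE treats "." as a literal character rather than a regex.
# as.character() guards against the column being stored as a factor.
parts <- strsplit(as.character(data$ComputerName), ".", fixed = TRUE)

# Assumption: every address has exactly four components; stop otherwise.
stopifnot(all(lengths(parts) == 4L))

# Reshape the flat character vector into an n x 4 matrix, one row per address.
octets <- matrix(unlist(parts), ncol = 4, byrow = TRUE)

# Assign whole columns in one step instead of element by element in a loop.
data$IPA <- octets[, 1]
data$IPB <- octets[, 2]
data$IPC <- octets[, 3]
data$IPD <- octets[, 4]
```

The components stay character strings here, as in the original loop. If
they are needed for numeric range comparisons, wrapping each column in
as.integer() should do.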
Felix
> Folks,
>
> I have a data frame with 4861469 rows that contains an ip address
> xxx.xxx.xxx.xxx as one of the columns. I want to assign a site to each
> row based on IP ranges. To do this I have a function to split the ip
> address as character into class A,B,C and D components. It works but is
> horribly inefficient in terms of speed. I can't quite see how one of the
> l/s/m/t/apply functions could be brought to bear on the problem. Does
> anyone have any thoughts?
>
> for(i in 1:4861469)
> {
> lst <-unlist(strsplit(data$ComputerName[i], "\\."))
> data$IPA[i] <-lst[[1]]
> data$IPB[i] <-lst[[2]]
> data$IPC[i] <-lst[[3]]
> data$IPD[i] <-lst[[4]]
> rm(lst)
> }
>
> Andrew
>
> Andrew Roberts
> Children's Orthopaedic Surgeon
> RJAH, Oswestry, UK