[R] splitting strings efficiently

Enrico Schumann enricoschumann at yahoo.de
Sun Jan 8 14:11:56 CET 2012


Hi Andrew,

strsplit is vectorised, so you can pass it the whole character vector 
at once; you do not have to call it for every element 
data$ComputerName[i].

If I understand correctly, something like this may help:

 > ip <- "123.456.789.321"  ## example data
 > df <- data.frame(ip = rep(ip, 9), stringsAsFactors=FALSE)
 > df
                ip
1 123.456.789.321
2 123.456.789.321
3 123.456.789.321
4 123.456.789.321
5 123.456.789.321
6 123.456.789.321
7 123.456.789.321
8 123.456.789.321
9 123.456.789.321

 > res <- unlist(strsplit(df[["ip"]], "\\."))
 > ii <- seq(1, nrow(df)*4, by = 4)
 > res[ii]   ## A
[1] "123" "123" "123" "123" "123" "123" "123"
[8] "123" "123"
 > res[ii+1] ## B
[1] "456" "456" "456" "456" "456" "456" "456"
[8] "456" "456"
 > res[ii+2] ## C
[1] "789" "789" "789" "789" "789" "789" "789"
[8] "789" "789"
 > res[ii+3] ## D
[1] "321" "321" "321" "321" "321" "321" "321"
[8] "321" "321"
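
The same idea can be written so that the four pieces end up as columns 
of the data frame, as in the original loop. This is only a sketch: the 
example addresses are made up, the column names IPA to IPD are taken 
from the question, and it assumes that every address has exactly four 
dot-separated parts (the stopifnot line checks this).

```r
## made-up example data, not the real data set
df <- data.frame(ip = c("192.168.0.1", "10.0.12.255", "172.16.254.3"),
                 stringsAsFactors = FALSE)

## split the whole column in one call; fixed = TRUE treats "." literally
parts <- strsplit(df[["ip"]], ".", fixed = TRUE)

## assumption: every address has exactly four components
stopifnot(all(lengths(parts) == 4L))

## one row per address, one column per component
m <- matrix(unlist(parts), ncol = 4, byrow = TRUE)

## column names as in the original question
df$IPA <- m[, 1]
df$IPB <- m[, 2]
df$IPC <- m[, 3]
df$IPD <- m[, 4]
```

The components are still character strings here; for comparisons 
against numeric ranges they could be converted with as.integer.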


Regards,
Enrico


On 08.01.2012 11:06, Andrew Roberts wrote:
> Folks,
>
> I have a data frame with 4861469 rows that contains an ip address
> xxx.xxx.xxx.xxx as one of the columns. I want to assign a site to each
> row based on IP ranges. To do this I have a function to split the ip
> address as character into class A,B,C and D components. It works but is
> horribly inefficient in terms of speed. I can't quite see how one of the
> l/s/m/t/apply functions could be brought to bear on the problem. Does
> anyone have any thoughts?
>
> for(i in 1:4861469)
>     {
>     lst<-unlist(strsplit(data$ComputerName[i], "\\."))
>     data$IPA[i]<-lst[[1]]
>     data$IPB[i]<-lst[[2]]
>     data$IPC[i]<-lst[[3]]
>     data$IPD[i]<-lst[[4]]
>     rm(lst)
>     }
>
> Andrew
>
> Andrew Roberts
> Children's Orthopaedic Surgeon
> RJAH, Oswestry, UK

-- 
Enrico Schumann
Lucerne, Switzerland
http://nmof.net/