[R] R help

arun smartpink111 at yahoo.com
Tue Feb 11 17:51:22 CET 2014


Hi,
My solution was based on the input dataset you showed. If "xy@12_g.com" should actually be "xy12_g@gmail.com" (or do both exist in the dataset? That is not clear), then try:

dat <- read.table(text="Emails
Mal123@gmail.com
Mahi.r@gmail.com
xyz@gmail.com
Ravi_123@yahoo.com
Lavk.lll@rediff.com
xy12_g@gmail.com",sep="",header=TRUE,stringsAsFactors=FALSE)
library(stringr)
vec1 <- dat$Emails
# Insert "_" between the leading letters and the first digit, then drop the final ".com"
vec2 <- gsub("\\.[[:alnum:]]+$","",gsub("^([[:alpha:]]+)(\\d+.*)","\\1_\\2",vec1))
# Entries with more than one "_" (e.g. "xy_12_g"): mark only the first "_" as the split point
indx <- which(str_count(vec2,"\\_")>1)
vec2[indx] <- str_replace(vec2[indx],"_","*")
# Remaining entries whose part before "@" contains a separator such as "_" or "."
indx1 <- setdiff(grep("[[:punct:]]+",gsub("\\@.*","",vec2)),indx)
parts <- lapply(seq_along(vec2),function(i) {
  if(i %in% indx1) strsplit(vec2[i],"[_@.]")[[1]]
  else if(i %in% indx) strsplit(vec2[i],"[*@]")[[1]]
  else strsplit(gsub("(.*)(\\@.*)","\\1*\\2",vec2[i]),"[*@]")[[1]]  # no separator: l.name is ""
})
res <- setNames(cbind(dat,do.call(rbind,parts)),c("Emails","f.name","l.name","domain"))
res[sapply(res,is.factor)] <- lapply(res[sapply(res,is.factor)],as.character)
res
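
If the rule is simply "leading letters are the first name, whatever follows (minus one optional "." or "_") is the last name", a shorter base-R version might also do. This is only a sketch under that assumption; split_email is a made-up helper name, and it ignores anything after the first dot of the domain.

```r
# Hypothetical helper (not part of the solution above): split each address into
# first name, last name and domain using base R only.
split_email <- function(x) {
  local  <- sub("@.*$", "", x)        # part before "@"
  domain <- sub("^.*@", "", x)        # part after "@"
  domain <- sub("\\..*$", "", domain) # keep only the text before the first "."
  data.frame(
    Emails = x,
    f.name = sub("^([[:alpha:]]*).*$", "\\1", local), # leading letters
    l.name = sub("^[[:alpha:]]*[._]?", "", local),    # rest, minus one separator
    domain = domain,
    stringsAsFactors = FALSE
  )
}

emails <- c("Mal123@gmail.com", "Mahi.r@gmail.com", "xyz@gmail.com",
            "Ravi_123@yahoo.com", "Lavk.lll@rediff.com", "xy12_g@gmail.com")
out <- split_email(emails)
out
```

For "xy12_g@gmail.com" this gives f.name "xy", l.name "12_g" and domain "gmail"; an address with no digits or separator, such as "xyz@gmail.com", gets an empty l.name.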


A.K.


On Tuesday, February 11, 2014 5:31 AM, Malyadri Putchakayala <malyadri.putchakayala at nuevora.com> wrote:


Hi,
                Emails f.name l.name domain
#1    Mal123@gmail.com    Mal    123  Gmail
#2    Mahi.r@gmail.com   Mahi      r  Gmail
#3       xyz@gmail.com    xyz         Gmail
#4  Ravi_123@yahoo.com   Ravi    123  yahoo
#5 Lavk.lll@rediff.com   Lavk    lll rediff
#6         xy@12_g.com     xy          12_g

All of the above is right, but my requirement is that 12_g should also be treated as the last name:
                Emails f.name l.name domain
#1    Mal123@gmail.com    Mal    123  Gmail
#2    Mahi.r@gmail.com   Mahi      r  Gmail
#3       xyz@gmail.com    xyz         Gmail
#4  Ravi_123@yahoo.com   Ravi    123  yahoo
#5 Lavk.lll@rediff.com   Lavk    lll rediff
#6    xy12_g@gmail.com     xy   12_g  Gmail


This is the final output I need. Please help if possible.



