[R] Removing values containing a specific character

arun smartpink111 at yahoo.com
Sun Jan 27 18:46:01 CET 2013


Hi Yasha,

 I guess you got Uwe's response. 

 I created `df2` so that both results could be derived from the original dataset.
Alternatively, once you have the first result with
df[,1][grep("@",df$names)]<- "" 
#you can get the second result by:
df[df$names!="",]
 # names             emails
#1   bob       bobj at cup.com
#2   joe joesmith at gmail.com
#4 emily   emily2 at yahoo.com

#or
df[grep("\\w+",df$names),]
#  names             emails
#1   bob       bobj at cup.com
#2   joe joesmith at gmail.com
#4 emily   emily2 at yahoo.com

However, I am not sure how well this will perform on 5.5 million rows.
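
For a large data frame, one option that may be worth trying is a single logical mask built with grepl(). This is only a minimal sketch, not something tested on 5.5 million rows. The sample data below is hypothetical: I am assuming the names in rows 3 and 5 hold the full email address, and that the archive's " at " stands for "@". The names has_at, df_blank and df_clean are made up for illustration.

```r
# Hypothetical sample data, reconstructed to mirror the example above
# (the exact values of the "bad" names are an assumption)
df <- data.frame(
  names  = c("bob", "joe", "craig@gmail.com", "emily", "jane@yahoo.com"),
  emails = c("bobj@cup.com", "joesmith@gmail.com", "craig@gmail.com",
             "emily2@yahoo.com", "jane@yahoo.com"),
  stringsAsFactors = FALSE
)

# fixed = TRUE matches "@" as a literal string instead of a regular expression
has_at <- grepl("@", df$names, fixed = TRUE)

# First result: blank out the names that contain "@"
df_blank <- df
df_blank$names[has_at] <- ""

# Second result: drop those rows entirely
df_clean <- df[!has_at, ]
```

The mask is computed once and reused for both results. Logical indexing with !has_at also keeps every row when nothing matches, whereas negative indexing with -grep() returns zero rows in that case.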
A.K.




----- Original Message -----
From: ypodeswa <ypodeswa at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Sunday, January 27, 2013 1:11 AM
Subject: Re: [R] Removing values containing a specific character

Actually, it worked perfectly for my sample data, but my actual data has
5.5 million rows, and grep doesn't seem to work with over a million rows.
Any idea on a workaround?


On Sat, Jan 26, 2013 at 9:37 PM, Yasha Podeswa <ypodeswa at gmail.com> wrote:

> Awesome, thanks Arun, that's exactly what I was looking for!
>
>
> On Sat, Jan 26, 2013 at 9:21 PM, arun kirshna [via R] <
> ml-node+s789695n4656749h63 at n4.nabble.com> wrote:
>
>> Hi,
>> Try this:
>> df[]<-lapply(df,as.character)
>> df2<-df
>> df[,1][grep("@",df$names)]<- ""
>> df
>>   #names             emails
>> #1   bob      bobj at cup.com
>> #2   joe joesmith at gmail.com
>> #3          craig at gmail.com
>> #4 emily  emily2 at yahoo.com
>> #5          jane at yahoo.com
>>
>> #2nd part:
>>
>>  df2[!grepl("@",df2$names),]  # unlike -grep(), this keeps all rows when nothing matches
>>  # names             emails
>> #1   bob      bobj at cup.com
>> #2   joe joesmith at gmail.com
>> #4 emily  emily2 at yahoo.com
>> A.K.
>>
>
>








