[R] Substring function?

Erik Iverson eriki at ccbr.umn.edu
Tue Jul 13 17:08:15 CEST 2010


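The high-level concept you need is called "regular expressions". R
supports these through several functions, such as grep() and grepl();
see ?regex and ?grep. Your %in% test only finds exact, whole-string
matches, which is why subset() returns zero rows.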
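
As a rough sketch of how that could look for your example, assuming the
URLs and signatures are held as plain character vectors (the names urls,
signatures, is_match and matched are mine, not from your code), and
assuming you want literal substring matching rather than real patterns:

```r
# Data taken from the quoted question, as plain character vectors.
urls <- c(
  "http://www.google.com/search?source=ig&hl=en&rlz=&=&q=stuff&aq=f&aqi=g10&aql=&oq=&gs_rfai=CrrIS3",
  "http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stuff&toggle=1"
)
signatures <- c("http://www.google.com/search")

# grepl() returns one TRUE/FALSE per URL for a single signature.
# fixed = TRUE treats the signature as a literal string, so characters
# such as "." and "?" are not read as regular-expression metacharacters.
per_signature <- lapply(
  signatures,
  function(sig) grepl(sig, urls, fixed = TRUE)
)

# Combine the per-signature results: a URL is kept if any signature matches.
is_match <- Reduce(`|`, per_signature)

matched <- urls[is_match]
print(matched)
```

This makes one vectorized grepl() call per signature, so hundreds of
signatures against thousands of URLs means a few hundred calls. I would
not expect that to be a problem, but I have not timed it on data of that
size.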

Ralf B wrote:
> Hi all,
> 
> I would like to detect all strings in the vector 'content' that
> contain the strings from the vector 'search'. Here is a code example:
> 
> content <- data.frame(urls=c(
>   "http://www.google.com/search?source=ig&hl=en&rlz=&=&q=stuff&aq=f&aqi=g10&aql=&oq=&gs_rfai=CrrIS3",
>   "http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stuff&toggle=1")
> )
> search <- data.frame(signatures=c("http://www.google.com/search"))
> subset(content, search$signatures %in% content$urls)
> 
> I am getting an empty result:
> 
> [1] urls
> <0 rows> (or 0-length row.names)
> 
> 
> What I would like to achieve is the return of
> "http://www.google.com/search?source=ig&hl=en&rlz=&=&q=stuff&aq=f&aqi=g10&aql=&oq=&gs_rfai=CrrIS3".
> Is that possible? In practice I would like to run this over 1000s of
> strings in 'content' and 100s of strings in 'search'. Could I run into
> performance issues with this approach and, if so, are there better
> ways?
> 
> Best,
> Ralf
> 


