[R] problem with pattern matching

William Dunlap wdunlap at tibco.com
Wed Aug 5 18:48:06 CEST 2009


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of jim holtman
> Sent: Wednesday, August 05, 2009 5:23 AM
> To: Rnewbie
> Cc: r-help at r-project.org
> Subject: Re: [R] problem with pattern matching
> 
> I think you want to use either 'match' or '%in%'
> 
> x <- dataframe$ID %in% list$ID  # TRUE if it is in list

grep(pattern,text) expects that pattern is a scalar string.
It, like quite a few other R functions, will not alert you if
you pass it several strings: it silently ignores all but the
first.  S+'s grep() will throw an error if length(pattern)!=1.
E.g.,

RS> grep(pattern=c("a+", "b+"), c("cat","dog","bear"), value=TRUE)
S+: Problem in regexpr(pattern, text): pattern should be a single character string, length is 2
S+: Use traceback() to see the call stack
R : [1] "cat"  "bear"
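
If the goal is to use every element of a vector of IDs, here is a
minimal sketch of three ways to do it in R. The vectors ids and wanted
are made-up stand-ins for dataframe$ID and list$ID, not the original
poster's data, and which approach fits depends on whether the IDs are
meant as whole strings or as regular expressions.

```r
# Hypothetical data standing in for dataframe$ID and list$ID.
ids    <- c("cat", "dog", "bear", "cab")
wanted <- c("cat", "bear")

# 1. Exact matches of whole strings: no regular expressions needed.
exact <- ids[ids %in% wanted]

# 2. Regular expressions: collapse the patterns into one alternation
#    so that grep() receives a single string.
patterns <- c("a+", "b+")
combined <- grep(paste(patterns, collapse = "|"), ids, value = TRUE)

# 3. One grep() call per pattern, keeping the results separate.
per_pattern <- lapply(patterns, grep, x = ids, value = TRUE)
names(per_pattern) <- patterns
```

Note that approach 2 treats each ID as a regular expression, so IDs
containing characters such as "." or "(" would need escaping first;
approach 1 avoids that issue entirely.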

I think it would be better if R caught this mistake as well.
R does catch the 0-length argument, but gives a pretty generic
error message (perhaps to make translations easier):

RS> grep(pattern=character(), c("cat","dog","beet"))
S+: Problem in regexpr(pattern, text): pattern should be a single character string, length is 0
S+: Use traceback() to see the call stack
R : Error in grep(pattern = character(), c("cat", "dog", "beet")) :
R :   invalid argument
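
Until then, one can guard against the mistake oneself. The following is
only a sketch: strict_grep is a hypothetical wrapper name, not a
function in R or S+, and the error text simply imitates the S+ message
shown above.

```r
# Hypothetical wrapper: refuse anything but a single, non-NA pattern string.
strict_grep <- function(pattern, x, ...) {
  if (!is.character(pattern) || length(pattern) != 1L || is.na(pattern)) {
    stop("pattern should be a single character string, length is ",
         length(pattern))
  }
  grep(pattern, x, ...)
}

# A single pattern is passed straight through to grep().
ok <- strict_grep("a+", c("cat", "dog", "bear"), value = TRUE)

# tryCatch() captures the error message so the script keeps running.
msg <- tryCatch(strict_grep(c("a+", "b+"), c("cat", "dog", "bear")),
                error = function(e) conditionMessage(e))
```

Checking the length up front makes the failure loud at the call that
caused it, instead of returning a plausible-looking but incomplete
result.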

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

> 
> On Wed, Aug 5, 2009 at 5:36 AM, Rnewbie<xuancj at yahoo.com> wrote:
> >
> > I wanted to extract my interested rows from a dataframe. I used:
> >
> > grep(list$ID, dataframe$ID, value=T) #list contains a list of my
> > interested IDs
> >
> > I got one match in return, which is the very first ID in list. It seems
> > the matching process just stopped, once the first match was found.
> >
> >
> >
> > David Winsemius wrote:
> >>
> >>
> >> On Aug 4, 2009, at 11:16 AM, Rnewbie wrote:
> >>
> >>>
> >>> dear all,
> >>>
> >>> I got a problem with pattern matching using grep. I extracted a list
> >>> of characters from a data frame, and I tried to match this list of
> >>> characters to a column from another data frame. In return, I got only
> >>> one match, but there should be far more matches. Any ideas what has
> >>> gone wrong?
> >>
> >> In general this falls into the category of a request to "read my
> >> mind". One, out of probably an infinite number, of ways to get such a
> >> result is to use if() when you needed ifelse().
> >>
> >>>
> >>> Another question, if I also want to match the whole of the elements
> >>> against the non-initial parts of the elements in another table. Which
> >>> command should I use?
> >>
> >> Cannot even assign a semantic meaning to that one. What are
> >> "non-initial parts of the elements of another table"?
> >>
> >>
> >> ******************************************************************
> >>>  .... provide commented, minimal, self-contained, reproducible code.
> >> ******************************************************************
> >>>
> >>> Thanks
> >>
> >> David Winsemius, MD
> >> Heritage Laboratories
> >> West Hartford, CT
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >>
> >
> > --
> > View this message in context: http://www.nabble.com/problem-with-pattern-matching-tp24810298p24823683.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
> >
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem that you are trying to solve?
> 
> 



