[R] string pattern matching

Joe Ceradini joeceradini at gmail.com
Thu Mar 23 14:37:24 CET 2017


Thanks for the additional response, Bill. I did not want to bog down
the question with the full context of the function. Briefly, given a
set of nested and non-nested regression models, I want to compare AIC
(bigger model - smaller model) and do an LRT for all the nested models
that differ by a single predictor. All models, nested or not, would
also have an AIC value (I am aware of the critiques of mixing p-value
hypothesis testing and information criteria). So, not quite
MuMIn::dredge. The tricky part, for me, has been doing the comparisons
for only the nested models in a set that contains nested and
non-nested. I made some progress with the function, so I'll refrain
from bugging the list with the whole thing unless (when) I'm stuck
again.
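
A minimal sketch of the comparison described above, not the function itself. It assumes a hypothetical candidate set fitted with lm() to the built-in mtcars data, and it finds nested pairs by comparing term labels from terms(), as Bill suggested, rather than by matching strings. The names rhs, fits, term_sets and result are made up for illustration.

```r
# Hypothetical candidate set: right-hand sides as strings, fitted to the
# built-in mtcars data purely for illustration.
rhs <- c("wt", "hp", "wt + hp", "wt + qsec", "wt + hp + qsec")
fits <- lapply(rhs, function(r) lm(as.formula(paste("mpg ~", r)), data = mtcars))
names(fits) <- rhs

# Term labels of each model, taken from terms() instead of the raw string.
term_sets <- lapply(fits, function(m) attr(terms(m), "term.labels"))

# Keep (smaller, bigger) pairs where every term of the smaller model is in
# the bigger model and the bigger model has exactly one extra term.
pairs <- expand.grid(small = seq_along(fits), big = seq_along(fits))
nested_by_one <- mapply(function(s, b) {
  all(term_sets[[s]] %in% term_sets[[b]]) &&
    length(term_sets[[b]]) == length(term_sets[[s]]) + 1
}, pairs$small, pairs$big)
pairs <- pairs[nested_by_one, ]

# AIC difference (bigger - smaller) and a likelihood ratio test per pair.
result <- do.call(rbind, lapply(seq_len(nrow(pairs)), function(i) {
  small <- fits[[pairs$small[i]]]
  big <- fits[[pairs$big[i]]]
  stat <- 2 * (as.numeric(logLik(big)) - as.numeric(logLik(small)))
  df <- attr(logLik(big), "df") - attr(logLik(small), "df")
  data.frame(
    smaller = rhs[pairs$small[i]],
    bigger = rhs[pairs$big[i]],
    added = setdiff(term_sets[[pairs$big[i]]], term_sets[[pairs$small[i]]]),
    delta_AIC = AIC(big) - AIC(small),
    LRT_stat = stat,
    LRT_p = pchisq(stat, df = df, lower.tail = FALSE),
    stringsAsFactors = FALSE
  )
}))
print(result)
```

Non-nested pairs such as "hp" and "wt + qsec" are skipped by construction. With one added parameter, delta_AIC equals 2 minus the LRT statistic, so the two columns carry the same information on different scales.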

For those interested in the motivation, I'm running with the idea of
trying to flag uninformative parameters which "steal" AIC model
weight, and potentially result in a misleading model set, depending
on how the reader interprets the set.
Arnold, T. W. 2010. Uninformative parameters and model selection using
Akaike’s information criterion. Journal of Wildlife Management
74:1175–1178.
Murtaugh, P. 2014. In defense of P values. Ecology 95:611–617.

Joe

On Wed, Mar 22, 2017 at 9:11 AM, William Dunlap <wdunlap at tibco.com> wrote:
> You did not describe the goal of your pattern matching.  Were you trying
> to match any string that could be interpreted as an R expression containing
> X1 and X3 as additive terms?   If so, you could turn the string into a one-sided
> formula and use the terms() function.  E.g.,
>
> f <- function(string) {
>     fmla <- as.formula(paste("~", string))
>     term.labels <- attr(terms(fmla), "term.labels")
>     all(c("X1","X3") %in% term.labels)
> }
>
>> f("X3 + X2 + X1")
> [1] TRUE
>> f("- X3 + X2 + X1")
> [1] FALSE
>> f("X3 + X2 + log(X1)")
> [1] FALSE
>> f("X3 + X2 + log(X1) + X1")
> [1] TRUE
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Wed, Mar 22, 2017 at 6:39 AM, Joe Ceradini <joeceradini at gmail.com> wrote:
>> Wow. Thanks to everyone (Jim, Ng Bo Lin, Bert, David, and Ulrik) for
>> all the quick and helpful responses. They have given me a better
>> understanding of regular expressions, and certainly answered my
>> question.
>>
>> Joe
>>
>> On Wed, Mar 22, 2017 at 12:22 AM, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
>>> Hi Joe,
>>>
>>> you could also rethink your pattern:
>>>
>>> grep("x1 \\+ x2", test, value = TRUE)
>>>
>>> grep("x1 \\+ x", test, value = TRUE)
>>>
>>> grep("x1 \\+ x[0-9]", test, value = TRUE)
>>>
>>> HTH
>>> Ulrik
>>>
>>> On Wed, 22 Mar 2017 at 02:10 Jim Lemon <drjimlemon at gmail.com> wrote:
>>>>
>>>> Hi Joe,
>>>> This may help you:
>>>>
>>>> test <- c("x1", "x2", "x3", "x1 + x2 + x3")
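>>>> multigrep <- function(x1, x2) {
>>>>   # split the pattern on spaces, e.g. "x1 + x3" -> "x1", "+", "x3"
>>>>   xbits <- unlist(strsplit(x1, " "))
>>>>   # TRUE for each piece found as a literal substring of x2;
>>>>   # fixed = TRUE keeps "+" from being read as a regex quantifier
>>>>   xans <- vapply(xbits, function(bit) any(grepl(bit, x2, fixed = TRUE)),
>>>>                  logical(1))
>>>>   all(xans)
>>>> }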
>>>> multigrep("x1 + x3","x1 + x2 + x3")
>>>> [1] TRUE
>>>> multigrep("x1 + x4","x1 + x2 + x3")
>>>> [1] FALSE
>>>>
>>>> Jim
>>>>
>>>> On Wed, Mar 22, 2017 at 10:50 AM, Joe Ceradini <joeceradini at gmail.com>
>>>> wrote:
>>>> > Hi Folks,
>>>> >
>>>> > Is there a way to find "x1 + x2 + x3" given "x1 + x3" as the pattern?
>>>> > Or is that a ridiculous question, since I'm trying to find something
>>>> > based on a pattern that doesn't exist?
>>>> >
>>>> > test <- c("x1", "x2", "x3", "x1 + x2 + x3")
>>>> > test
>>>> > [1] "x1"           "x2"           "x3"           "x1 + x2 + x3"
>>>> >
>>>> > grep("x1 + x2", test, fixed=TRUE, value = TRUE)
>>>> > [1] "x1 + x2 + x3"
>>>> >
>>>> >
>>>> > But what if I only have "x1 + x3" as the pattern and still want to
>>>> > return "x1 + x2 + x3"?
>>>> >
>>>> > grep("x1 + x3", test, fixed=TRUE, value = TRUE)
>>>> > character(0)
>>>> >
>>>> > I'm sure this looks like an odd question. I'm trying to build a
>>>> > function and stuck on this. Rather than dropping the whole function on
>>>> > the list, I thought I'd try one piece I needed help with...although I
>>>> > suspect that this question itself probably does not bode well for my
>>>> > function :)
>>>> >
>>>> > Thanks!
>>>> > Joe
>>>> >
>>>> > ______________________________________________
>>>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>>> > PLEASE do read the posting guide
>>>> > http://www.R-project.org/posting-guide.html
>>>> > and provide commented, minimal, self-contained, reproducible code.
>>>>
>>
>>
>>
>> --
>> Cooperative Fish and Wildlife Research Unit
>> Zoology and Physiology Dept.
>> University of Wyoming
>> JoeCeradini at gmail.com / 914.707.8506
>> wyocoopunit.org
>>


