[R] how to use AND in grepl

chalabi.elahe at yahoo.de chalabi.elahe at yahoo.de
Mon May 2 20:01:32 CEST 2016


I just changed all the names in Command to lowercase, and str_extract now works fine for "pd" and "t2", but not for "PDT2". Do you have any idea how I can get PDT2 returned by str_extract as well?
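
A note on why "PDT2" never comes back: no name in the example data contains the literal string "pdt2" (for instance, "pd_local_abdomen_t2" has other text between the two tokens), so a pattern built from 'PD', 't2' and 'PDT2' has nothing to match for the third alternative. One possible workaround, sketched here in base R under the assumption that the goal is one label per row for plotting, is to build the label from two separate grepl() tests. The column name Group and the label strings are illustrative choices, not something from the original posts.

```r
# Example names taken from earlier in this thread.
df <- data.frame(Command = c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq",
                             "knee_pd_t1_localize", "pd_local_abdomen_t2"),
                 stringsAsFactors = FALSE)

# Lowercase once so that "PD" and "pd" are treated alike.
cmd <- tolower(df$Command)

# One logical vector per token; fixed = TRUE means plain substring search.
has_pd <- grepl("pd", cmd, fixed = TRUE)
has_t2 <- grepl("t2", cmd, fixed = TRUE)

# Hypothetical label column: test the combined case first,
# otherwise every "both" row would already be labelled "PD".
df$Group <- ifelse(has_pd & has_t2, "PDT2",
            ifelse(has_pd, "PD",
            ifelse(has_t2, "t2", NA_character_)))

print(df)
```

Rows containing neither token get NA in Group, so they can be dropped with subset(df, !is.na(Group)) before plotting if that is what is wanted.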


On Monday, May 2, 2016 9:16 AM, Tom Wright <tom at maladmin.com> wrote:



The first thing I notice here is that your first two subset statements are searching in an object named Command, not the column df$Command. I'm not at all sure what you are trying to achieve with the str_extract process, but it is looking for the exact string 'PDT2'; the vectors / data frames formed in your previous commands are not being used at all.
Moving forward, I think you need to pay attention to case: "PD" != "pd". Also, the set PDT2 is going to be a subset of both sets PD and t2; I don't think this is what you are after.
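
The two points above (case sensitivity, and the overlap between the sets) can be checked on the example names from this thread. This is only a sketch; the variable names are made up for illustration.

```r
# Example names taken from earlier in this thread.
cmd <- c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq",
         "knee_pd_t1_localize", "pd_local_abdomen_t2")

# Case-sensitive search misses "_localize_PD".
pd_cs <- grepl("pd", cmd)

# ignore.case = TRUE treats "PD" and "pd" as the same token.
pd_ci <- grepl("pd", cmd, ignore.case = TRUE)
t2_ci <- grepl("t2", cmd, ignore.case = TRUE)

# AND is the element-wise & of the two logical vectors.
both <- pd_ci & t2_ci

# Every row in "both" is also in the pd set and in the t2 set,
# so the three groups overlap unless "both" rows are excluded.
pd_only <- pd_ci & !t2_ci
t2_only <- t2_ci & !pd_ci

print(data.frame(cmd, pd_cs, pd_ci, t2_ci, both, pd_only, t2_only))
```

With pd_only, t2_only and both, each name falls into at most one group, which is probably closer to what a plot with three categories needs.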

On Mon, May 2, 2016, 8:49 AM  <chalabi.elahe at yahoo.de> wrote:

Yes, it works, but let me explain what I am going to do. I extract all the names I want and then create a new column out of them for my plot. This is the whole thing I do:
>  PD=subset(df,grepl("pd",Command)) # extract names in Command with only "pd"
>  t2=subset(df,grepl("t2",Command)) # extract names with only "t2"
>  PDT2=subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)) # extract names which contain both "pd" and "t2"
>  v1=c('PD','t2','PDT2') # I create a vector with these conditions
>  str_extract(df$Command,paste(v1,collapse='|')) # returning patterns, using stringr library
>
>Here I see no pattern named PDT2; there are only PD and t2 patterns.
>On Monday, May 2, 2016 8:18 AM, Tom Wright <tom at maladmin.com> wrote:
>
>
>
>Sorry for the missed braces earlier. I was typing on a phone, not the best place to conjugate regular expressions.
>Using the example you provided:
>
>> df=data.frame(Command=c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq", "knee_pd_t1_localize", "pd_local_abdomen_t2"))
>
>> grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)
>[1] FALSE FALSE FALSE FALSE  TRUE
>
>> subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command))
>              Command
>5 pd_local_abdomen_t2
>
>
>
>On Mon, May 2, 2016 at 7:42 AM, <chalabi.elahe at yahoo.de> wrote:
>
>Thanks Peter, you were right, the exact grepl is grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command), but it does not change anything in Command. When I check the number of matches with sum(grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)), the result is 0, but I am sure that it should not be 0. It seems that this AND does not work.
>>
>>
>>
>>On Monday, May 2, 2016 5:05 AM, peter dalgaard <pdalgd at gmail.com> wrote:
>>
>>On 02 May 2016, at 12:43 , ch.elahe via R-help <r-help at r-project.org> wrote:
>>
>>> Thanks for your reply tom. After using  Subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)"),df$Command)  I get this error: Argument "x" is missing, with no default. Actually I don't know how to fix this. Do you have any idea?
>>
>>Tom's code was missing a ")" but not where you put one. He probably also didn't intend to capitalize "subset".
>>
>>
>>-pd
>>
>>> Thanks,
>>> Elahe
>>>
>>>
>>> On Saturday, April 30, 2016 7:35 PM, Tom Wright <tom at maladmin.com> wrote:
>>>
>>>
>>>
>>> Actually not sure my previous answer does what you wanted. Using your approach:
>>> t2pd=subset(df,grepl("t2",df$Command) & grepl("pd",df$Command))
>>> Should work.
>>> I think the regex pattern you are looking for is:
>>> Subset(df,grepl("(.* t2.*pd.* )|(.* pd.* t2.*)",df$Command)
>>>
>>> On Sat, Apr 30, 2016, 7:07 PM Tom Wright <tom at maladmin.com> wrote:
>>>
>>> subset(df,grepl("t2|pd",df$Command))
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Apr 30, 2016 at 2:38 PM, ch.elahe via R-help <r-help at r-project.org> wrote:
>>>>
>>>> Hi all,
>>>>>
>>>>> I have one factor variable in my df and I want to extract the names from it which contain both "t2" and "pd":
>>>>>
>>>>> 'data.frame': 36919 obs. of 162 variables
>>>>>  $TE                :int 38,41,11,52,48,75,.....
>>>>>  $TR                :int 100,210,548,546,.....
>>>>>  $Command          :factor W/2229 levels "_localize_PD","_localize_tre_t2","_abdomen_t1_seq","knee_pd_t1_localize","pd_local_abdomen_t2"...
>>>>>
>>>>> I have tried this but did not get a result:
>>>>>
>>>>> t2pd=subset(df,grepl("t2",Command) & grepl("pd",Command))
>>>>>
>>>>>
>>>>> Does anyone know how to apply AND in grepl?
>>>>>
>>>>> Thanks
>>>>> Elahe
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>> .
>>>
>>
>>--
>>Peter Dalgaard, Professor,
>>Center for Statistics, Copenhagen Business School
>>Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>Phone: (+45)38153501
>>Office: A 4.23
>>Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>>


