[R] Selecting columns whose names contain "mutated" except when they also contain "non" or "un"

David Winsemius dwinsemius at comcast.net
Mon Apr 23 18:16:39 CEST 2012


On Apr 23, 2012, at 12:10 PM, Paul Miller wrote:

> Hello All,
>
> Started out a while ago trying to select columns in a data frame whose  
> names contain some variation of the word "mutant" using code like:
>
> names(KRASyn)[grep("muta", names(KRASyn))]
>
> The idea then would be to add together the various columns using  
> code like:
>
> KRASyn$Mutant_comb <- rowSums(KRASyn[grep("muta", names(KRASyn))])
>
> What I discovered, though, is that this selects columns like  
> "nonmutated" and "unmutated" as well as columns like "mutated",  
> "mutation", and "mutational".
>
> So I'd like to know how to select columns that have some variation  
> of the word "mutant" without the "non" or the "un". I've been  
> looking around for an example of how to do that but haven't found  
> anything yet.
>
> Can anyone show me how to select the columns I need?

If you want only columns whose names _begin_ with "muta" then add the  
"^" character at the beginning of your pattern:

names(KRASyn)[grep("^muta", names(KRASyn))]

(The "^" anchor, which matches only at the start of a string, is described on the ?regex help page.)
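
For what it's worth, here is a minimal sketch of how the anchored pattern could feed the rowSums() step from the original question. The small KRASyn data frame and its values are made up for illustration, since the real column names and data were not posted. The second selection (keep_mid) is not something asked for above; it is just one possible way to drop the "non"/"un" names if "muta" can also appear in the middle of a name.

```r
# Hypothetical stand-in for KRASyn; the real data were not posted.
KRASyn <- data.frame(
  mutated    = c(1, 0, 1),
  mutation   = c(0, 0, 1),
  nonmutated = c(0, 1, 0),
  unmutated  = c(1, 1, 0)
)

nms <- names(KRASyn)

# Anchored pattern: keep only names that start with "muta".
keep <- grep("^muta", nms, value = TRUE)

# Alternative (an assumption, not from the thread): match "muta" anywhere,
# then exclude names where it is directly preceded by "non" or "un".
keep_mid <- nms[grepl("muta", nms) & !grepl("(non|un)muta", nms)]

# drop = FALSE keeps a data frame even when only one column matches,
# so rowSums() still receives a two-dimensional object.
KRASyn$Mutant_comb <- rowSums(KRASyn[, keep, drop = FALSE])
```

With these made-up columns both selections pick "mutated" and "mutation" only. Note that grep() is case-sensitive by default, so a name like "Mutated" would need ignore.case = TRUE.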

-- 

David Winsemius, MD
West Hartford, CT


