[R] calculations on columns with partially matching names

David Winsemius dwinsemius at comcast.net
Mon Jan 4 01:08:04 CET 2010


On Jan 3, 2010, at 6:09 PM, Jim Bouldin wrote:

>
> Is there a command for partial matching of character strings?
> Specifically, I'd like to be able to calculate the mean of the values
> in any columns in a data frame or matrix that have identity in part of
> their column names.  For example, columns labeled "mpw06a" and "mpw06b"
> match on the first five characters; their mean would be taken whereas
> any columns beginning with other than "mpw06" would be excluded.

?grep
?"["

 > tdf <- data.frame(mpw06a=rnorm(10), mpw06b=rnorm(10), abc=rnorm(10))

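 > # "^" anchors the match to the start of the name; drop = FALSE keeps
 > # a data frame even when only one column matches
 > lapply(tdf[ , grep("^mpw06", names(tdf)), drop = FALSE ], mean)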
$mpw06a
[1] -0.1825447

$mpw06b
[1] -0.2386772
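
If what you want is the mean taken across the matching columns rather than one mean per column, something along these lines might serve. This is only a sketch: the seed and the names idx, row_means, grand_mean, prefix and group_means are my own choices for illustration, and the grouping on the first five characters assumes that every prefix of interest has exactly that length.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
tdf <- data.frame(mpw06a = rnorm(10), mpw06b = rnorm(10), abc = rnorm(10))

# Positions of the columns whose names start with the shared prefix
idx <- grep("^mpw06", names(tdf))

# One mean per row, averaged across the matching columns
row_means <- rowMeans(tdf[, idx, drop = FALSE])

# A single grand mean over every value in the matching columns
grand_mean <- mean(as.matrix(tdf[, idx, drop = FALSE]))

# Hypothetical generalisation: group all columns by their first five
# characters and take the row-wise mean within each group
prefix <- substr(names(tdf), 1, 5)
group_means <- sapply(split(seq_along(tdf), prefix),
                      function(i) rowMeans(tdf[, i, drop = FALSE]))
```

Here group_means comes back as a matrix with one column per distinct prefix, so a name that shares its first five characters with nothing else simply forms a group of one.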

> I need to compare every pair of
> columns in the frame, and in some cases, possibly three at a time.

?combn
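
As a rough illustration of how combn could be put to work here, the sketch below enumerates every pair (and then every triple) of column names and takes the row-wise mean of each combination. The seed and the names col_pairs, pair_means and triple_means are assumptions made for the example, and rowMeans is only a stand-in for whatever comparison you actually need.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
tdf <- data.frame(mpw06a = rnorm(10), mpw06b = rnorm(10), abc = rnorm(10))

# Every pair of column names, one pair per column of the result
col_pairs <- combn(names(tdf), 2)

# Apply a function to each pair; simplify = FALSE returns a list
pair_means <- combn(names(tdf), 2,
                    FUN = function(nm) rowMeans(tdf[, nm]),
                    simplify = FALSE)

# Label each list element with the pair it came from
names(pair_means) <- apply(col_pairs, 2, paste, collapse = ":")

# The same idea, three columns at a time
triple_means <- combn(names(tdf), 3,
                      FUN = function(nm) rowMeans(tdf[, nm]),
                      simplify = FALSE)
```

Passing the column names rather than the data frame itself keeps each combination as a character vector, which can then be used directly for indexing.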

>
> Thanks in advance for any ideas.




> Jim Bouldin
> Research Ecologist
> Department of Plant Sciences, UC Davis
> Davis CA, 95616
> 530-554-1740

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


