[R] selecting dataframe columns based on substring of col name(s)

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Wed Jun 21 20:54:03 CEST 2017


d[ , paste0( "col", 2:4 ) ]

or

d[ , sprintf( "col%d", 2:4 ) ]

or

d[ , grep( "^col[234]$", names( d ) ) ]

Each approach is flexible in a different way: the first two build the names from a numeric range, while grep selects whichever names match a pattern.
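
As a rough sketch of how the three compare on the example data from the question (idx is just a name made up here for the range of column numbers, not anything from the original post):

```r
# Example data frame from the original question
d <- data.frame( col1 = runif(10), col2 = runif(10),
                 col3 = runif(10), col4 = runif(10) )

# Hypothetical range of column numbers; adjust as needed
idx <- 2:4

# 1. Build the names by pasting prefix and number (paste0 adds no separator)
p1 <- d[ , paste0( "col", idx ) ]

# 2. Same idea with sprintf; a format such as "col%03d" would suit zero-padded names
p2 <- d[ , sprintf( "col%d", idx ) ]

# 3. Match by regular expression; grep returns the positions of the matching names
p3 <- d[ , grep( "^col[2-4]$", names( d ) ) ]

# All three select the same columns
stopifnot( identical( p1, p2 ), identical( p1, p3 ) )
```

One difference worth knowing: the first two stop with an "undefined columns selected" error if a generated name does not exist in d, whereas grep quietly returns whatever matches, which may be nothing.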
-- 
Sent from my phone. Please excuse my brevity.

On June 21, 2017 9:11:10 AM PDT, Evan Cooch <evan.cooch at gmail.com> wrote:
>Suppose I have the following sort of dataframe, where each column name
>has a common structure: prefix, followed by a number (for this example,
>col1, col2, col3 and col4):
>
>  d = data.frame( col1=runif(10), col2=runif(10), col3=runif(10), col4=runif(10) )
>
>What I haven't been able to suss out is how to efficiently 
>'extract/manipulate/play with' columns from the data frame, making use 
>of this common structure.
>
>Suppose, for example, I want to 'work with' col2, col3, and col4. Now, I
>could subset the dataframe d in any number of ways -- for example:
>
>piece <- d[,c("col2","col3","col4")]
>
>Works as expected, but for *big* problems (where I might have dozens ->
>hundreds of columns -- often the case with big design matrices output by
>some linear models program or another), having to write them all out
>using c("col2","col3",...."colXXXXX") takes a lot of time. What I'm
>wondering about is if there is a way to simply select over the "changing
>part" of the column name (you can do this relatively easily in a data
>step in SAS, for example). Heuristically, something like:
>
>piece <- d[,col2:col4]
>
>where the heuristic col2:col4 is interpreted as col2 -> col4 (parse the
>prefix 'col', and then simply select over the changing suffix -- i.e.,
>column number).
>
>Now, if I use the "to" function in the lessR package, I can get there 
>from here fairly easily:
>
>piece <- d[,to("col",4,from=2,same.size=FALSE)]
>
>But, is there a better way? Beyond 'efficiency' (ease of
>implementation), part of what constitutes 'better' might be something in
>base R, rather than relying on a package?
>
>Thanks in advance...
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.


