[R] concatenating range of columns in dataframe

Evan Cooch evan.cooch at gmail.com
Fri Mar 10 05:16:39 CET 2017


Suppose I have the following data frame (call it df):

Trt   y1   y2   y3   y4
A1A   1    0    0    1
A1B   1    1    0    0
A1C   0    1    0    1
A1D   1    1    1    1

What I want to do is concatenate columns y1 -> y4 into a single contiguous 
string (which I'll call df$conc), so that the final df looks like

Trt   conc
A1A   1001
A1B   1100
A1C   0101
A1D   1111


Now, if my initial dataframe was simply

  1   0   0   1
  1   1   0   0
  0   1   0   1
  1   1   1   1

then apply(df,1,paste,collapse="") does the trick, more or less.

But once I have a Trt column, this approach yields

A1A1001
A1B1100
A1C0101
A1D1111
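
For reproducibility, here is a minimal sketch that rebuilds the toy data and shows both behaviours described above. Storing y1 -> y4 as numeric 0/1 columns and Trt as a character column are assumptions about how df is actually held; the object names num_only and with_trt are made up for illustration.

```r
# Toy data matching the example (column types are an assumption).
df <- data.frame(
  Trt = c("A1A", "A1B", "A1C", "A1D"),
  y1  = c(1, 1, 0, 1),
  y2  = c(0, 1, 1, 1),
  y3  = c(0, 0, 0, 1),
  y4  = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Numeric columns only: each row collapses to one string of digits.
num_only <- apply(df[, -1], 1, paste, collapse = "")

# All columns: Trt ends up glued onto the front of the digits.
with_trt <- apply(df, 1, paste, collapse = "")

print(num_only)
print(with_trt)
```
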

I need to keep Trt separate from the other columns. So, I'm 
trying to concatenate a subset of columns in the data frame, but I don't 
want to have to do something like create a character vector of the 
column names to do it (e.g., c("y1","y2","y3","y4")). Doing a few by hand 
that way is easy, but not if you have dozens to hundreds of columns to 
work with.

Ideally, I'd like to be able to say

"concatenate df[,2:5], get rid of the spaces, assign the concatenated 
columns to a new named column, and drop the original columns from the 
final df."

Heuristically,

df$conc <- concatenate df[,2:5] # adding conc as a new column in df
df[,2:5] <- NULL   # to drop original columns 2 -> 5
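
For concreteness, the heuristic above could be sketched in base R roughly as follows. This is one possible approach under stated assumptions, not a definitive answer: it assumes y1 -> y4 sit in positions 2 to 5, that none of those columns is named sep or collapse (do.call passes the column names through as argument names), and the helper variable cols is made up for illustration.

```r
# Toy data matching the example (column types are an assumption).
df <- data.frame(
  Trt = c("A1A", "A1B", "A1C", "A1D"),
  y1  = c(1, 1, 0, 1),
  y2  = c(0, 1, 1, 1),
  y3  = c(0, 0, 0, 1),
  y4  = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

cols <- 2:5                          # select by position, so no vector of names is needed

# paste0() is vectorised: handing it the columns as a list pastes them row by row.
df$conc <- do.call(paste0, df[cols])

# Drop the original columns; conc was appended after them, so it is unaffected.
df[cols] <- NULL

print(df)
```

With many columns, cols could be any positional range (say 2:200), or be computed with something like grep("^y", names(df)) if the columns share a prefix. apply(df[, cols], 1, paste, collapse = "") should give the same strings for this data, though it converts the columns to a matrix first.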

Suggestions/pointers to the obvious appreciated.

Thanks in advance!


