[R] Consolidate column contents of equally "named" columns
David Winsemius
dwinsemius at comcast.net
Sun Apr 29 15:05:25 CEST 2012
On Apr 28, 2012, at 10:46 AM, Daniel Malter wrote:
> Hi,
>
> I have a data frame whose first row (not the header) contains the true
> column names. The same column name can occur multiple times in the
> dataset.
> Columns with equal names are not adjacent, and for each observation
> only one
> of the equally named columns contains the actual data (see the example
> below). I am looking for an easy method consolidate these columns
> into one
> column for each unique column name. Say,
>
> x1<-c("x",1,NA,NA)
> x2<-c("x",NA,2,NA)
> x3<-c("x",NA,NA,3)
> y1<-c("y",3,NA,NA)
> y2<-c("y",NA,1,NA)
> y3<-c("y",NA,NA,2)
> d<-data.frame(x1,y1,x2,y2,x3,y3)
> d
>
It would avoid problems with manipulating factors it these were
created (or converted to) character columns, choose one of:
d=data.frame(x1,y1,x2,y2,x3,y3, stringsAsFactors=FALSE)
d[]<-lapply(d, as.character)
> # d looks like:
>
> x1 y1 x2 y2 x3 y3
> 1 x y x y x y
> 2 1 3 <NA> <NA> <NA> <NA>
> 3 <NA> <NA> 2 1 <NA> <NA>
> 4 <NA> <NA> <NA> <NA> 3 2
>
>> From this, I want to create the table or data frame
>
> x y
> 1 3
> 2 1
> 3 2
na.omit(
data.frame(
X=stack(d[-1,grep("x", names(d))]),
Y=stack(d[-1,grep("y", names(d))]),
stringsAsFactors=FALSE)[ c(1,3) ])
X.values Y.values
1 1 3
5 2 1
9 3 2
If it were less regular you might need to merge with the "source"
columns that stack generates.
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
More information about the R-help
mailing list