[R] collapse rows in a matrix

Tony Plate tplate at blackmesacapital.com
Fri Nov 15 23:02:49 CET 2002


At 12:35 PM 11/15/2002 -0500, you wrote:
>Hi, all,
>   I have a little problem to solve. I'd like to collapse rows that are next
>to each other and have the same values into one row. The following is an
>example.
>
>Say x is a data frame like:
>    X1 X2 X3 X4 X5
>a  1  0  0  0  1
>b  1  0  1  0  1
>c  1  0  0  0  1
>d  1  0  0  0  1
>e  1  0  0  0  1
>f  1  1  0  0  1
>g  1  1  0  0  1
>
>Notice that a, c, d and e are the same. Since c, d and e are next to each
>other, I will only use the middle one, i.e. d. I will also keep a, although
>it is the same as d.
>
>Of f and g, I will keep either one.
>
>So the ideal output is:
>a  1  0  0  0  1
>b  1  0  1  0  1
>d  1  0  0  0  1
>f  1  1  0  0  1
>
>Any idea how to do it?  Thanks!!

I think the following will do what you seem to want, except that it keeps 
the first row of each run of duplicated rows (i.e., of rows "c", "d" and 
"e", it keeps "c"):

 > x <- read.table(header=TRUE, file=stdin(), nrows=7)
   X1 X2 X3 X4 X5
a  1  0  0  0  1
b  1  0  1  0  1
c  1  0  0  0  1
d  1  0  0  0  1
e  1  0  0  0  1
f  1  1  0  0  1
g  1  1  0  0  1
 > x[which(c(1,apply(apply(x, 2, diff), 1, any))!=0),,drop=FALSE]
   X1 X2 X3 X4 X5
a  1  0  0  0  1
b  1  0  1  0  1
c  1  0  0  0  1
f  1  1  0  0  1
 >

This expression depends on any() returning TRUE if any of the values it is 
given are non-zero, which does seem to work for negative integers, and for 
any finite floating-point number.
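
For readers who find the one-liner dense, here is a minimal sketch that 
unpacks it into steps. It is not the expression from the transcript: it 
uses diff() on a matrix and rowSums() instead of nested apply() calls, and 
the names "changed", "run_id" and "pick_middle" are made up for 
illustration. It assumes all columns are numeric and that x has at least 
one row. It also shows one possible way to keep the middle row of each run, 
as the original question asked, instead of the first.

```r
# Rebuild the example data frame from the post.
x <- data.frame(
  X1 = c(1, 1, 1, 1, 1, 1, 1),
  X2 = c(0, 0, 0, 0, 0, 1, 1),
  X3 = c(0, 1, 0, 0, 0, 0, 0),
  X4 = c(0, 0, 0, 0, 0, 0, 0),
  X5 = c(1, 1, 1, 1, 1, 1, 1),
  row.names = c("a", "b", "c", "d", "e", "f", "g")
)

# Step 1: differences between each row and the row before it.
# diff() on a matrix works column by column and returns one row fewer.
d <- diff(as.matrix(x))

# Step 2: a row starts a new run if any of its differences is non-zero.
# The first row always starts a run.
changed <- c(TRUE, rowSums(d != 0) > 0)

# Step 3: number the runs; adjacent identical rows share a run id.
run_id <- cumsum(changed)

# Keep the first row of each run (same result as the one-liner).
first_rows <- x[changed, , drop = FALSE]   # rows a, b, c, f

# Hypothetical helper: middle index of a run (lower middle if even length).
pick_middle <- function(idx) idx[(length(idx) + 1) %/% 2]
middle_idx <- vapply(split(seq_len(nrow(x)), run_id), pick_middle, integer(1))
middle_rows <- x[unname(middle_idx), , drop = FALSE]   # rows a, b, d, f
```

Working with explicit run ids costs a few more lines, but it makes the 
choice of which row to keep from each run a separate, replaceable step.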

However, the expression above doesn't work if there are NA values: any() 
can then return NA, and which() silently drops that row. For it to work in 
the presence of NAs you need to replace "any" by something like 
"function(x) !identical(all(x==0),TRUE)" (leaving the quotes off), which 
treats a row as changed unless every difference is a known zero. (Are there 
any more elegant ways of expressing this?)
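
One possible NA-tolerant variant, offered as a sketch rather than a 
definitive answer: the helper name "collapse_runs" is hypothetical, it 
assumes all columns are numeric, and it assumes that a comparison involving 
NA should count as "different", so such rows are kept. It uses isTRUE() 
from base R in place of identical(..., TRUE).

```r
# Hypothetical helper (not from the original post): keep the first row of
# each run of identical adjacent rows, treating any comparison that
# involves NA as "different".
collapse_runs <- function(x) {
  m <- as.matrix(x)
  if (nrow(m) < 2) return(x)   # nothing to compare
  d <- diff(m)                 # row-to-row differences
  # all(r == 0) is TRUE only when every difference is a known zero; it is
  # NA or FALSE otherwise, and isTRUE() maps both of those to FALSE.
  same_as_prev <- apply(d, 1, function(r) isTRUE(all(r == 0)))
  x[c(TRUE, !same_as_prev), , drop = FALSE]
}

# Small made-up example with an NA in the third row.
y <- data.frame(
  X1 = c(1, 1, NA, 1, 1),
  X2 = c(0, 0, 0, 0, 0),
  row.names = c("a", "b", "c", "d", "e")
)
kept <- collapse_runs(y)   # keeps rows a, c, d
```

Row "b" is dropped as a duplicate of "a", while "c" and "d" are kept 
because their differences from the previous row involve an NA.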

Hope this helps,

Tony Plate



