[R] Remove similar rows from matrix

PIKAL Petr petr.pikal at precheza.cz
Thu Aug 23 14:09:25 CEST 2012


Hi

I cannot reproduce exactly what you want, but maybe you can adapt this to suit your needs.

sel1 <- rowSums(is.na(mat)) # number of NA values in each row
sel2 <- c(0, rowSums(apply(mat, 2, diff) == 0, na.rm = TRUE)) # number of columns equal to the previous row

But the first row of each run of equal rows is not flagged this way, so I also flag it by adding each row's count to the count of the row after it:

sel <- c(rowSums(embed(sel2, 2)), 0)
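
To see what the embed() step does, here is a toy run. The sel2 values below are made up for illustration and are not taken from mat.

```r
# Made-up counts: rows 3 and 5 each repeat the row before them
toy_sel2 <- c(0, 0, 3, 0, 5)

# embed(x, 2) pairs every element with its predecessor:
# row k of the result is (x[k + 1], x[k])
pairs <- embed(toy_sel2, 2)

# Summing each pair gives x[k] + x[k + 1]; the last row has no successor,
# so a 0 is appended for it
toy_sel <- c(rowSums(pairs), 0)
print(toy_sel)
```

So rows 2 and 4, which start a run, get a non-zero flag as well as rows 3 and 5.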

Here I keep only the rows that either have no NA or do not match a neighbouring row:
mat[(sel1 * sel) == 0, ]

This is not exactly what you want, as one of the rows starting with 328 should be included. So there has to be another trick, but I cannot come up with one.
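
One direction that might work, as a rough sketch only: treat a row as redundant when some other row agrees with it in every position where it has a number, and that other row has fewer NA values (or is an earlier exact duplicate). The function name drop_covered_rows is hypothetical, and the pairwise loop compares every row with every other row, so it is meant for small matrices.

```r
# Hypothetical helper, not from any package: drop rows that are "covered"
# by another row, i.e. equal to it wherever they hold a number.
drop_covered_rows <- function(m) {
  n_na <- rowSums(is.na(m))          # NA count per row
  keep <- rep(TRUE, nrow(m))
  for (i in seq_len(nrow(m))) {
    known <- !is.na(m[i, ])          # positions where row i has a number
    for (j in seq_len(nrow(m))) {
      if (i == j) next
      # row j must have numbers in all those positions, and the same ones
      covered <- !anyNA(m[j, known]) && all(m[i, known] == m[j, known])
      # prefer the row with fewer NAs; among exact duplicates keep the first
      better <- n_na[j] < n_na[i] || (n_na[j] == n_na[i] && j < i)
      if (covered && better) {
        keep[i] <- FALSE
        break
      }
    }
  }
  m[keep, , drop = FALSE]
}

# Toy data (made up): rows 2 and 3 are covered by row 1, row 5 repeats row 4
toy <- rbind(c(1, 2, 3), c(1, NA, 3), c(1, 2, NA),
             c(4, 5, NA), c(4, 5, NA), c(7, NA, 9))
res <- drop_covered_rows(toy)
print(res)
```

Unlike the diff() approach, this does not assume that similar rows are adjacent. The result of drop_covered_rows(mat) would still have to be checked against mat2.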

Regards
Petr

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Tonja Krueger
> Sent: Wednesday, August 22, 2012 10:16 AM
> To: r-help at r-project.org
> Subject: [R] Remove similar rows from matrix
> 
> 
>    Hi everybody,
> 
>    I have a matrix (mat) from which I want to remove all rows that differ
>    from other rows in that matrix only by having one or two NAs instead of
>    numbers.
> 
>    I would prefer to remove the rows with more NAs, so that in the end the
>    matrix would look like mat2.
> 
>    Has someone done something similar before? Thanks for helping, Tonja
> 
> 
>    Here is my example:
> 
>    ex <- c(14, 56, 114, 132, 187, 279, 324, 328, 328, 338, 338, 338,
> 346, 346,
>    395, 398, 428, 428, 428, 452, 452, 452, NA, 466, 467, 525, 894, 923,
> 968,
>    980, 1030, 1117, 1156, NA, 1159, 1166, 1166, 1166, 1171, 1171, 1209,
> 1211,
>    1235, 1235, 1235, 1275, 1275, 1275, NA, 1291, 1292, 1378, 829, 851,
> 880,
>    893, 929, 1003, 1042, 1045, 1045, 1051, 1051, 1051, 1057, 1057,
> 1097, 1099,
>    1119, 1119, 1119, 1147, 1147, 1147, 1147, 1167, 1168, 1235, 494,
> 510, 533,
>    538, 567, 623, 657, 660, 660, 666, 666, 666, 671, 671, 699, 702, NA,
> 722,
>    722, NA, NA, 744, 744, 759, 760, 816, 276, 293, 312, 318, 338, NA,
> NA, 418,
>    418, 424, 424, NA, 429, 429, NA, NA, 468, 468, 468, 490, 490, 490,
> 490, 508,
>    509, 568, 674, 696, 726, 734, 774, 851, 893, 896, 896, 903, 903,
> 903, 908,
>    908, 944, 947, 966, 966, 966, NA, 998, 998, 998, 1014, 1015, 1091,
> 421, 446,
>    472, 490, 510, 582, 624, 627, 627, 633, 633, NA, 640, 640, 669, 671,
> 685,
>    685, 685, 716, 716, 716, 716, 736, 737, 798, NA, NA, NA, NA, NA, NA,
> 74, NA,
>    NA, 82, NA, 82, 86, NA, 104, NA, 114, NA, 114, 119, 119, 119, 119,
> NA, NA,
>    NA)
> 
>    mat <- matrix(ex, ncol=8)
> 
> 
>    ex2 <- c(14, 56, 114, 132, 187, 279, 324, 328, 338, 346, 395, 398,
> 428, 452,
>    466, 467, 525, 894, 923, 968, 980, 1030, 1117, 1156, 1159, 1166,
> 1171, 1209,
>    1211, 1235, 1275, 1291, 1292, 1378, 829, 851, 880, 893, 929, 1003,
> 1042,
>    1045, 1051, 1057, 1097, 1099, 1119, 1147, 1167, 1168, 1235, 494,
> 510, 533,
>    538, 567, 623, 657, 660, 666, 671, 699, 702, 722, 744, 759, 760,
> 816, 276,
>    293, 312, 318, 338, NA, NA, 418, 424, 429, NA, NA, 468, 490, 508,
> 509, 568,
>    674, 696, 726, 734, 774, 851, 893, 896, 903, 908, 944, 947, 966,
> 998, 1014,
>    1015, 1091, 421, 446, 472, 490, 510, 582, 624, 627, 633, 640, 669,
> 671, 685,
>    716, 736, 737, 798, NA, NA, NA, NA, NA, NA, 74, NA, 82, 86, 104, NA,
> 114,
>    119, NA, NA, NA)
> 
>    mat2 <- matrix(ex2, ncol=8)

