[R] Re: Successively eliminating most frequent elements
Petr PIKAL
petr.pikal at precheza.cz
Thu Aug 9 09:51:30 CEST 2007
Hi,
your construction is quite complicated, so instead of refining it I tried
a different approach. If I understand correctly what you want to do, you
can use
> set.seed(1)
> T <- matrix(trunc(runif(20)*10), nrow=10, ncol=2)
> T
      [,1] [,2]
 [1,]    2    2
 [2,]    3    1
 [3,]    5    6
 [4,]    9    3
 [5,]    2    7
 [6,]    8    4
 [7,]    9    7
 [8,]    6    9
 [9,]    6    3
[10,]    0    7
> m <- table(T) # a matrix is a vector with dimensions, so all elements are counted
> # flag rows that contain the most frequent element (the first one in case of ties)
> todel <- rowSums(T == as.numeric(names(which.max(m)))) > 0
> T[todel,]
     [,1] [,2]
[1,]    2    2
[2,]    2    7
> T[!todel,]
     [,1] [,2]
[1,]    3    1
[2,]    5    6
[3,]    9    3
[4,]    8    4
[5,]    9    7
[6,]    6    9
[7,]    6    3
[8,]    0    7
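A side note on the as.numeric(names(...)) step: the names of a table are character strings, so they have to be converted back to numbers before they are compared with the matrix. The small sketch below uses the concatenated vector printed in the question (the object names are only illustrative). It also shows what happens when the table goes through as.data.frame() and data.matrix() instead: the first column then holds factor codes, i.e. positions among the sorted distinct values, not the values themselves. That may be why the loop in the question kept looking for an element that is not in the matrix.

```r
# Illustration only; 'x', 'tab', 'from_names' and 'codes' are made-up names
x <- c(2, 6, 9, 3, 4, 7, 5, 1, 9, 3, 2, 7, 9, 5, 0, 9, 7, 7, 6, 3)
tab <- table(x)

# names of a table are character, so convert them back to numbers
from_names <- as.numeric(names(tab))

# data.matrix() replaces the factor column by its integer codes,
# i.e. positions among the sorted distinct values
codes <- data.matrix(as.data.frame(tab))[, 1]

print(from_names)
print(unname(codes))
```

Comparing the two printed vectors shows where the codes and the actual values drift apart (here the value 8 does not occur, so code 8 stands for 7).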
You can put all of this into a loop, but you have to decide when the loop
should stop.
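A minimal sketch of such a loop follows. The function name eliminate_frequent, the argument max_iter and the stopping rule (stop after max_iter passes or when no rows are left) are assumptions made for illustration, not something fixed by the question.

```r
# Sketch only: function name, 'max_iter' and the stopping rule are assumptions
eliminate_frequent <- function(T, max_iter = 6) {
  # one row per pass: the removed element and its frequency
  G <- matrix(NA_real_, nrow = max_iter, ncol = 2)
  for (i in seq_len(max_iter)) {
    if (nrow(T) == 0) break                  # nothing left to remove
    m <- table(T)                            # counts over all elements
    top <- as.numeric(names(which.max(m)))   # first most frequent element
    G[i, ] <- c(top, max(m))
    todel <- rowSums(T == top) > 0           # rows containing 'top'
    T <- T[!todel, , drop = FALSE]           # drop = FALSE keeps T a matrix
  }
  list(remaining = T, removed = G)
}

set.seed(1)
T <- matrix(trunc(runif(20) * 10), nrow = 10, ncol = 2)
res <- eliminate_frequent(T)
print(res$removed)
print(res$remaining)
```

With set.seed(1) the first pass removes the rows containing 2, as in the transcript above. The drop = FALSE matters once only one row is left, because otherwise T would collapse to a plain vector and rowSums() would fail.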
Regards
Petr
petr.pikal at precheza.cz
r-help-bounces at stat.math.ethz.ch wrote on 08.08.2007 15:33:58:
> Dear experts,
>
> I have a 10x2 matrix T containing random integers. I would like to delete
> pairs (rows) iteratively, which contain the most frequent element either in
> the first or second column:
>
>
> T <- matrix(trunc(runif(20)*10), nrow=10, ncol=2)
>
> G <- matrix(0, nrow=6, ncol=2)
>
> for (i in (1:6)){
>   print("****************** Start iteration " ~i~ " *******************")
>   print("Current matrix:")
>   print(T)
>
>   m <- append(T[,1], T[,2])
>
>   print("Concatenated columns:")
>   print(m)
>
>
>   # build frequency table
>   F <- data.matrix(as.data.frame(table(m)))
>
>   dimnames(F)<-NULL
>
>   # pick up the most frequent element: sort decreasing and take it from the top
>   F <- F[order(F[,2], decreasing=TRUE),]
>
>   print("Freq. table:")
>   print(F[1:5,])
>
>   todel <- F[1,1] # rows containing the most frequent element will be deleted
>   G[i,1] <- todel
>   G[i,2] <- F[1,2]
>
>   print("todel="~todel)
>
>   # eliminate rows containing the most frequent element
>   # either the first or the second column contains this element
>   id <- which(T[,1]==todel)
>   print("Indexes of rows to be deleted:")
>   print(id)
>   if (length(id)>0){
>     T <- T[-1*id, ]
>   }
>
>   id <- which(T[,2]==todel)
>   print("Indexes of rows to be deleted:")
>   print(id)
>   if (length(id)>0){
>     T <- T[-1*id, ]
>   }
>
>   print("nrow(T)="~nrow(T))
>
> }
>
> print("Result matrix:")
> print(G)
>
> The output of the first two iterations looks as follows. As one can see,
> the frequency table in the second iteration still contains the element deleted
> in the first iteration! Is this a bug, or what am I doing wrong here?
> Any help greatly appreciated!
>
> [1] "****************** Start iteration 1 *******************"
> [1] "Current matrix:"
>       [,1] [,2]
>  [1,]    2    2
>  [2,]    6    7
>  [3,]    9    9
>  [4,]    3    5
>  [5,]    4    0
>  [6,]    7    9
>  [7,]    5    7
>  [8,]    1    7
>  [9,]    9    6
> [10,]    3    3
> [1] "Concatenated columns:"
> [1] 2 6 9 3 4 7 5 1 9 3 2 7 9 5 0 9 7 7 6 3
> [1] "Freq. table:"
>      [,1] [,2]
> [1,]    8    4
> [2,]    9    4
> [3,]    4    3
> [4,]    3    2
> [5,]    6    2
> [1] "todel=8"
> [1] "Indexes of rows to be deleted:"
> integer(0)
> [1] "Indexes of rows to be deleted:"
> integer(0)
> [1] "nrow(T)=10"
> [1] "****************** Start iteration 2 *******************"
> [1] "Current matrix:"
>       [,1] [,2]
>  [1,]    2    2
>  [2,]    6    7
>  [3,]    9    9
>  [4,]    3    5
>  [5,]    4    0
>  [6,]    7    9
>  [7,]    5    7
>  [8,]    1    7
>  [9,]    9    6
> [10,]    3    3
> [1] "Concatenated columns:"
> [1] 2 6 9 3 4 7 5 1 9 3 2 7 9 5 0 9 7 7 6 3
> [1] "Freq. table:"
>      [,1] [,2]
> [1,]    8    4
> [2,]    9    4
> [3,]    4    3
> [4,]    3    2
> [5,]    6    2
> [1] "todel=8"
> [1] "Indexes of rows to be deleted:"
> integer(0)
> [1] "Indexes of rows to be deleted:"
> integer(0)
> [1] "nrow(T)=10"
> [1] "****************** Start iteration 3 *******************"
> [1] "Current matrix:"
> ...
>