[R] How to rank matrix data by deciles?

vincent.deluard vincent.deluard at trimtabs.com
Fri May 14 01:31:06 CEST 2010




Dear Phil,

You helped me with a request to rank matrix columns by deciles two weeks
ago.

This really unblocked me on this project, but I found a little bug.

As before, my data is in a matrix:

> madebt[1:16,1:2]
       X4.19.2010  X4.16.2010
 [1,] 26.61197531 26.58950617
 [2,]  5.72765432  5.73074074
 [3,]  5.95839506  5.96222222
 [4,]  5.64333333  5.64777778
 [5,] 20.93814815 20.95728395
 [6,]  0.00000000  0.00000000
 [7,]  0.07000000  0.07000000
 [8,] 12.87802469 12.86888889
 [9,]  3.64407407  3.64543210
[10,]  0.05037037  0.05049383
[11,] 25.59024691 25.60888889
[12,]  3.47987654  3.53246914
[13,]  0.00000000  0.00000000
[14,] 31.39037037 31.39049383
[15,]  3.78296296  3.77641975
[16,] 13.17876543 13.19617284

The apply function will work for this sample of my data:

debtdeciles = apply(madebt[1:16,1:2],2,function(x)
            cut(x,quantile(x,(0:10)/10,na.rm=TRUE),
                label=FALSE,include.lowest=TRUE))

debtdeciles

     X4.19.2010 X4.16.2010
 [1,]         10         10
 [2,]          6          6
 [3,]          6          6
 [4,]          5          5
 [5,]          8          8
 [6,]          1          1
 [7,]          2          2
 [8,]          7          7
 [9,]          4          4
[10,]          2          2
[11,]          9          9
[12,]          3          3
[13,]          1          1
[14,]         10         10
[15,]          4          4
[16,]          8          8

However, it fails for the first 17 rows:

> madebt[1:17,1:2]
       X4.19.2010  X4.16.2010
 [1,] 26.61197531 26.58950617
 [2,]  5.72765432  5.73074074
 [3,]  5.95839506  5.96222222
 [4,]  5.64333333  5.64777778
 [5,] 20.93814815 20.95728395
 [6,]  0.00000000  0.00000000
 [7,]  0.07000000  0.07000000
 [8,] 12.87802469 12.86888889
 [9,]  3.64407407  3.64543210
[10,]  0.05037037  0.05049383
[11,] 25.59024691 25.60888889
[12,]  3.47987654  3.53246914
[13,]  0.00000000  0.00000000
[14,] 31.39037037 31.39049383
[15,]  3.78296296  3.77641975
[16,] 13.17876543 13.19617284
[17,]  0.00000000  0.00000000


> debtdeciles = apply(madebt[1:17,1:2],2,function(x)
+             cut(x,quantile(x,(0:10)/10,na.rm=TRUE),
+                 label=FALSE,include.lowest=TRUE))
Error in cut.default(x, quantile(x, (0:10)/10, na.rm = TRUE), label = FALSE,  :
  'breaks' are not unique
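
One way to check what cut() is objecting to is to print the breaks that
quantile() returns. The sketch below rebuilds the first column of the 17-row
sample as a plain vector (the name x is only for illustration) and tests the
breaks for duplicates. With quantile()'s default interpolation, the 0% and 10%
quantiles of 17 values that include three zeros should both come out as 0,
which would explain the error message.

```r
# First column of madebt[1:17, ], copied from the printout above
x <- c(26.61197531, 5.72765432, 5.95839506, 5.64333333, 20.93814815,
       0, 0.07, 12.87802469, 3.64407407, 0.05037037, 25.59024691,
       3.47987654, 0, 31.39037037, 3.78296296, 13.17876543, 0)

# The same breaks that are passed to cut() in the failing call
breaks <- quantile(x, (0:10)/10, na.rm = TRUE)
print(breaks)

# TRUE if at least two break points share the same value
has_dup <- anyDuplicated(breaks) > 0
print(has_dup)
```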

My guess is that the problem comes from the 3 zeros we now have in each
column. With 17 numbers per column, a decile cannot hold more than 2 elements,
and I believe R cannot determine where to put the third zero. Do you have a
solution for this problem?
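
A possible workaround, offered as a sketch rather than a definitive answer:
rank each column first, breaking ties by order of appearance, and cut the ranks
instead of the raw values. The ranks are all distinct, so the quantile breaks
cannot collide. The helper name decile_rank is made up for this example, and
the approach assumes it is acceptable for tied values (here the zeros) to land
in neighbouring deciles. If ties must share a bin, passing unique() breaks to
cut() is an alternative, at the cost of getting fewer than 10 bins.

```r
# Hypothetical helper: decile of each element based on its rank
decile_rank <- function(x) {
  # ties.method = "first" gives every non-NA value a distinct rank
  r <- rank(x, ties.method = "first", na.last = "keep")
  cut(r, quantile(r, (0:10)/10, na.rm = TRUE),
      labels = FALSE, include.lowest = TRUE)
}

# First column of madebt[1:17, ], copied from the printout above
x <- c(26.61197531, 5.72765432, 5.95839506, 5.64333333, 20.93814815,
       0, 0.07, 12.87802469, 3.64407407, 0.05037037, 25.59024691,
       3.47987654, 0, 31.39037037, 3.78296296, 13.17876543, 0)

d <- decile_rank(x)
print(d)

# For the whole matrix the call would be: apply(madebt, 2, decile_rank)
```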

Many thanks,

-- 
View this message in context: http://r.789695.n4.nabble.com/How-to-rank-matrix-data-by-deciles-tp2133496p2215945.html