[R] How to rank matrix data by deciles?
Phil Spector
spector at stat.berkeley.edu
Fri May 14 01:50:52 CEST 2010
Vincent -
I'm afraid there's no solution other than artificially modifying
the zeroes:
> vec
[1] 26.58950617 5.73074074 5.96222222 5.64777778 20.95728395 0.00000000 0.07000000 12.86888889
[9] 3.64543210 0.05049383 25.60888889 3.53246914 0.00000000 31.39049383 3.77641975 13.19617284
[17] 0.00000000
> cut(vec,quantile(vec,(0:10)/10),include.lowest=TRUE,label=FALSE)
Error in cut.default(vec, quantile(vec, (0:10)/10), include.lowest = TRUE, :
'breaks' are not unique
> vec[vec==0] = jitter(vec[vec==0])
> cut(vec,quantile(vec,(0:10)/10),include.lowest=TRUE,label=FALSE)
[1] 10 6 7 5 9 1 3 7 4 2 9 4 2 10 5 8 1
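It gives an answer, but it may not make sense for all data:
jitter() adds random noise, so the three tied zeroes are split
arbitrarily between deciles 1 and 2 above.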
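
To see why cut() complains, it can help to look at the breaks themselves.
The sketch below rebuilds vec from the printout above and checks for
duplicated quantiles. It then shows one possible deterministic alternative
to jittering, based on rank(). The helper name decile_by_rank and its ties
argument are made up for illustration and are not part of the original
exchange; treat this as a rough sketch rather than a recommended method.

```r
# The 17 values from the printout above (before jittering).
vec <- c(26.58950617, 5.73074074, 5.96222222, 5.64777778, 20.95728395,
         0.00000000, 0.07000000, 12.86888889, 3.64543210, 0.05049383,
         25.60888889, 3.53246914, 0.00000000, 31.39049383, 3.77641975,
         13.19617284, 0.00000000)

# With three zeroes among 17 values, the 0% and 10% quantiles are both 0,
# so the breaks passed to cut() contain a duplicate.
brks <- quantile(vec, (0:10)/10)
anyDuplicated(brks) > 0   # TRUE when two breaks coincide

# Hypothetical helper (an assumption, not from the original post):
# derive deciles from ranks instead of from quantile breaks.
# ties = "min" keeps equal values in the same decile, at the cost of
# unequal group sizes; ties = "first" splits ties by position instead.
decile_by_rank <- function(x, ties = "min") {
  r <- rank(x, ties.method = ties, na.last = "keep")  # NAs stay NA
  n <- sum(!is.na(x))
  ceiling(10 * r / n)
}

dec <- decile_by_rank(vec)
```

Under the same assumptions, apply(madebt, 2, decile_by_rank) would handle
the matrix from the quoted message column by column. Whether tied values
should share a decile or be split is a judgment call that depends on the data.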
- Phil
On Thu, 13 May 2010, vincent.deluard wrote:
>
> Dear Phil,
>
> You helped me with a request to rank matrix columns by deciles two weeks
> ago.
>
> This really unblocked me on this project, but I found a little bug.
>
> As before, my data is in a matrix:
>
>> madebt[1:16,1:2]
> X4.19.2010 X4.16.2010
> [1,] 26.61197531 26.58950617
> [2,] 5.72765432 5.73074074
> [3,] 5.95839506 5.96222222
> [4,] 5.64333333 5.64777778
> [5,] 20.93814815 20.95728395
> [6,] 0.00000000 0.00000000
> [7,] 0.07000000 0.07000000
> [8,] 12.87802469 12.86888889
> [9,] 3.64407407 3.64543210
> [10,] 0.05037037 0.05049383
> [11,] 25.59024691 25.60888889
> [12,] 3.47987654 3.53246914
> [13,] 0.00000000 0.00000000
> [14,] 31.39037037 31.39049383
> [15,] 3.78296296 3.77641975
> [16,] 13.17876543 13.19617284
>
> The apply function will work for this sample of my data:
>
> debtdeciles = apply(madebt[1:16,1:2],2,function(x)
> cut(x,quantile(x,(0:10)/10,
> na.rm=TRUE),label=FALSE,include.lowest=TRUE))
>
> debtdeciles
>
> X4.19.2010 X4.16.2010
> [1,] 10 10
> [2,] 6 6
> [3,] 6 6
> [4,] 5 5
> [5,] 8 8
> [6,] 1 1
> [7,] 2 2
> [8,] 7 7
> [9,] 4 4
> [10,] 2 2
> [11,] 9 9
> [12,] 3 3
> [13,] 1 1
> [14,] 10 10
> [15,] 4 4
> [16,] 8 8
>
> However, it will fail for
>
>> madebt[1:17,1:2]
> X4.19.2010 X4.16.2010
> [1,] 26.61197531 26.58950617
> [2,] 5.72765432 5.73074074
> [3,] 5.95839506 5.96222222
> [4,] 5.64333333 5.64777778
> [5,] 20.93814815 20.95728395
> [6,] 0.00000000 0.00000000
> [7,] 0.07000000 0.07000000
> [8,] 12.87802469 12.86888889
> [9,] 3.64407407 3.64543210
> [10,] 0.05037037 0.05049383
> [11,] 25.59024691 25.60888889
> [12,] 3.47987654 3.53246914
> [13,] 0.00000000 0.00000000
> [14,] 31.39037037 31.39049383
> [15,] 3.78296296 3.77641975
> [16,] 13.17876543 13.19617284
> [17,] 0.00000000 0.00000000
>
>
>> debtdeciles = apply(madebt[1:17,1:2],2,function(x)
> + cut(x,quantile(x,(0:10)/10,
> na.rm=TRUE),label=FALSE,include.lowest=TRUE))
> Error in cut.default(x, quantile(x, (0:10)/10, na.rm = TRUE), label = FALSE, :
> 'breaks' are not unique
>
> My guess is that we now have 3 "zeros" in each column. For each decile, we
> cannot have more than 2 elements (total of 17 numbers in each column) and I
> believe R cannot determine where to put the third "zero". Do you have any
> solution for this problem?
>
> Many thanks,
>
> --
> View this message in context: http://r.789695.n4.nabble.com/How-to-rank-matrix-data-by-deciles-tp2133496p2215945.html
> Sent from the R help mailing list archive at Nabble.com.