[R] problem with "unique" function
peter dalgaard
pdalgd at gmail.com
Fri Jul 28 18:15:18 CEST 2017
Most likely, previous computations have ended up giving slightly different values of say 0.13333. A pragmatic way out is to round to, say, 5 digits before applying unique. In this particular case, it seems like all numbers are multiples of 1/30, so another idea could be to multiply by 30, round, and divide by 30.
-pd
> On 28 Jul 2017, at 17:17 , li li <hannah.hlx at gmail.com> wrote:
>
> I have the joint distribution of three discrete random variables z1, z2 and
> z3 which is captured by "z"
> and "prob" as described below.
>
> For example, the probability for z1=0.46667, z2=-1 and z3=-1 is 2.752e-13.
> Also, the probability adds up to 1.
>
>> head(z) z1 z2 z3
> [1,] -0.46667 -1.0000 -1.0000
> [2,] -0.33333 -0.9333 -0.9333
> [3,] -0.20000 -0.8667 -0.8667
> [4,] -0.06667 -0.8000 -0.8000
> [5,] 0.06667 -0.7333 -0.7333
> [6,] 0.20000 -0.6667 -0.6667> prob[1:5][1] 2.752e-13 3.210e-12
> 1.348e-11 2.656e-11 2.656e-11> sum(prob)[1] 1
>
>
> I want to put the distribution into a joint probability table. I use the
> following code. But the probability no longer adds up to 1.
>
>> z1 <- sort(unique(z[,1])); z2 <- sort(unique(z[,2])); z3 <- sort(unique(z[,3]))> P <- array(0, dim=c(length(z1),length(z2),length(z3)), dimnames=list(A=z1, H=z2, M=z3))> > for (i in 1:(dim(z)[1])){+ ind <- z[i,]+ P[dimnames(P)$A==ind[1], dimnames(P)$H==ind[2], dimnames(P)$M==ind[3]] <- prob[i] + }> sum(P)[1] 37.5
>
>
> The problem is when we look z1 as below, there are lot of repeated values.
>
>> unique(z[,1]) [1] -0.46667 -0.33333 -0.20000 -0.06667 0.06667 0.20000 0.33333 0.46667 -0.53333 -0.40000
> [11] -0.26667 -0.13333 0.00000 0.13333 0.26667 0.40000 0.53333
> -0.60000 -0.06667 0.06667
> [21] 0.60000 -0.66667 -0.13333 0.13333 0.66667 -0.73333 -0.60000
> -0.20000 0.20000 0.60000
> [31] 0.73333 -0.80000 -0.53333 -0.26667 0.26667 0.53333 0.80000
> -0.86667 -0.46667 -0.33333
> [41] 0.33333 0.46667 0.86667 -0.93333 -0.40000 0.40000 0.93333
> -1.00000 -0.46667 -0.33333
> [51] 0.33333 0.46667 1.00000 -0.53333 -0.26667 0.26667 0.53333
> -0.20000 0.20000 -0.66667
> [61] -0.13333 0.13333 0.66667 -0.73333 -0.06667 0.06667 0.73333
> -0.20000 0.20000 -0.13333
> [71] 0.13333 -0.06667 0.06667 -0.06667 0.06667
>
>
>
> Is there a way to fix this? Any idea and suggestions? Thanks very much!!
>
> Hanna
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
More information about the R-help
mailing list