[R] Generate random numbers up to one

(Ted Harding) ted.harding at nessie.mcc.ac.uk
Tue Mar 6 19:10:58 CET 2007


On 06-Mar-07 Petr Klasterecky wrote:
> Barry Rowlingson napsal(a):
>> Petr Klasterecky wrote:
>> 
>>> You need to specify what 'random' means. If you have any numbers,
>>> you can always make them add-up to 1:
>>> x <- rnorm(100) #runif(100), rpois(100) etc.
>>> x <- x/sum(x)
>>> sum(x)
>> 
>>  I see a slight problem that may occur with dividing by sum(x) in 
>> certain cases....
>> 
>> Barry
>> 
> 
> OK, dividing by 0 is not nice, but the original question was very 
> general and I wanted to give some minimal advice at least. However,
> I see a more serious issue I forgot to mention. So just to make it
> clear: sum(x) is a random variable as well and dividing by sum(x)
> does not preserve the original distribution data were generated from.
> 
> Petr

And, specifically (to take just 2 RVs X and Y), while U = X/(X+Y)
and V = Y/(A+Y) are two RVs which summ to 1, the distribution of U
is not the same as the distribution of X conditional on (X+Y = 1).

So X|(X+Y = 1) and Y|(X+Y = 1) are also two RVs which add up to 1,
but with different distributions from U = X/(X+Y) and V = Y/(X+Y).

Example:
Suppose X and Y are independent and uniformly distributed on (0,1).

It is quite easy to see that, conditional on (X+Y = 1), X is
uniformly distributed on (0,1). Similarly, so is Y.

The distribution of U = X/(X+Y) is a bit trickier to derive, but
the outcome is that, for (0 < u < 1/2):

  P(X/(X+Y) <= u) = u/(2*(1-u))

so U has density 1/(2*(1-u)^2) over 0 < u < 1/2. Similarly, U has
density 1/(2*(u^2)) over 1/2 < u < 1.

It is entertaining to verify this using R:

A: The uniform distribution of X|(X+Y=1)
   (condition approximated by 0.99 < X+Y < 1.01)

  X<-runif(10000); Y<-runif(10000); W<-(X+Y);
  ix<-((W > 0.99)&(W < 1.01)); U<-X[ix]

  while(length(U) < 10000){
    X<-runif(10000); Y<-runif(10000); W<-(X+Y);
    ix<-((W>=0.99)&(W<1.01)); U<-c(U,X[ix])
  }
  hist(U,breaks=100)


B: The non-uniform (see above) distribution of X/(X+Y)

  X<-runif(10000); Y<-runif(10000); U<-X/(X+Y)
  hist(U,breaks=100)
  u<-0.01*(0:50); fu<-1/(2*(1-u)^2)
  lines(u,100*fu)
  u<-0.01*(50:100); fu<-1/(2*(u)^2)
  lines(u,100*fu)

Best wishes to all,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 06-Mar-07                                       Time: 18:10:54
------------------------------ XFMail ------------------------------



More information about the R-help mailing list