[R] Generate random vectors (continuous number) with a fixed sum
Bert Gunter
bgunter@4567 @end|ng |rom gm@||@com
Sat Apr 26 21:18:53 CEST 2025
Yes, thanks. I got my algebra/constraints wrong.
... Sigh...
Unfortunatey, this seems to make things yet more difficult.
-- Bert
"An educated person is one who can entertain new ideas, entertain others,
and entertain herself."
On Sat, Apr 26, 2025 at 10:01 AM Duncan Murdoch <murdoch.duncan using gmail.com>
wrote:
> On 2025-04-24 3:25 p.m., Bert Gunter wrote:
> > Folks:
> >
> > Unless my wee old brain errs (always a danger), uniform sampling from an
> > n-vector <xi> for which 0 <= ai <= xi <= bi and SUM(xi) = k, a constant,
> > where to ensure that the constraints are consistent (and nontrivial),
> > SUM(ai)< k and SUM(bi) > k, is a simple linear transformation (details
> left
> > to the reader) of uniform sampling from a standard n-1 dim simplex, for
> > which a web search yielded:
> >
> https://cs.stackexchange.com/questions/3227/uniform-sampling-from-a-simplex
>
> I think for n > 2 it is sometimes a simplex, and sometimes a different
> shape. For example, with n = 3 and all a_i = 0 and all b_i = 1, it's a
> simplex if k < 1 or k > 2, but not for 1 < k < 2, where it's a hexagon.
>
> You can visualize this with the following code:
>
> x <- matrix(runif(30000), ncol = 3) # the full cube
> x0 <- x[1.45 < sums & sums < 1.55,] # points near k = 1.5
> rgl::plot3d(x0)
>
>
> > The proposed "solutions" in this thread which rejection sample
> sequentially
> > from uniforms do *not* produce a uniform distribution on the simplex, as
> > the link and references therein explain.
>
> My solution didn't do rejection sampling, but it won't give a uniform
> density. I didn't realize that was requested.
>
> >
> > Apologies if I have misunderstood the problem and proposed solutions,
> > although there does seem to be some confusion in the thread about exactly
> > what was sought.
>
> Yes, that's true.
>
> Duncan Murdoch
>
> >
> > Cheers,
> > Bert
> >
> > "An educated person is one who can entertain new ideas, entertain others,
> > and entertain herself."
> >
> >
> >
> > On Thu, Apr 24, 2025 at 10:33 AM Brian Smith <briansmith199312 using gmail.com
> >
> > wrote:
> >
> >> Hi Rui,
> >>
> >> This code is able to generate absolutely correct random sample vector
> >> based on the applicable constraints.
> >>
> >> I have one question though.
> >>
> >> If I understood the R code correctly then, the first element is
> >> drawing random number from Uniform distribution unconditionally,
> >> however drawing of sample point for the second element is conditional
> >> to the first one. Therefore if we have large vector size instead of
> >> current 2, I guess the feasible region for the last few elements will
> >> be very small.
> >>
> >> Will that be any problem? does there any algorithm exist where all
> >> (n-1) elements would be drawn unconditionally assuming our vector has
> >> n elements?
> >>
> >> Thanks and regards,
> >>
> >>
> >>
> >> On Wed, 23 Apr 2025 at 10:57, Rui Barradas <ruipbarradas using sapo.pt>
> wrote:
> >>>
> >>> Hello,
> >>>
> >>> Here are your tests and the random numbers' histograms.
> >>>
> >>>
> >>> one_vec <- function(a, b, s) {
> >>> repeat{
> >>> repeat{
> >>> u <- runif(1, a[1], b[1])
> >>> if(s - u > 0) break
> >>> }
> >>> v <- s - u
> >>> if(a[2] < v && v < b[2]) break
> >>> }
> >>> c(u, v)
> >>> }
> >>> gen_mat <- function(m, a, b, s) {
> >>> replicate(m, one_vec(a, b, s))
> >>> }
> >>>
> >>> a <- c(0.015, 0.005)
> >>> b <- c(0.070, 0.045)
> >>> s <- 0.05528650577311
> >>> m <- 10000L
> >>>
> >>> set.seed(2025)
> >>> res <- gen_mat(m, a, b, s)
> >>> apply(res, 1, min) > a
> >>> #> [1] TRUE TRUE
> >>> apply(res, 1, max) < b
> >>> #> [1] TRUE TRUE
> >>>
> >>> # plot histograms of one million 2d vectors
> >>> system.time(
> >>> res1mil <- gen_mat(1e6, a, b, s)
> >>> )
> >>> #> user system elapsed
> >>> #> 3.01 0.06 3.86
> >>>
> >>> old_par <- par(mfrow = c(1, 2))
> >>> hist(res1mil[1L,])
> >>> hist(res1mil[2L,])
> >>> par(old_par)
> >>>
> >>>
> >>> Hope this helps,
> >>>
> >>> Rui Barradas
> >>>
> >>> Às 23:31 de 22/04/2025, Rui Barradas escreveu:
> >>>> Hello,
> >>>>
> >>>> Inline.
> >>>>
> >>>> Às 17:55 de 22/04/2025, Brian Smith escreveu:
> >>>>> i.e. we should have
> >>>>>
> >>>>> all elements of Reduce("+", res) should be equal to s =
> >> 0.05528650577311
> >>>>>
> >>>>> My assertion is that it is not happing here.
> >>>>
> >>>> You are right, that's not what is happening. The output is n vectors
> of
> >>>> 2 elements each. It's each of these vectors that add up to s.
> >>>> Appparently I misunderstood the problem.
> >>>>
> >>>> Maybe this is what you want?
> >>>> (There is no n argument, the matrix is always 2*m)
> >>>>
> >>>>
> >>>> one_vec <- function(a, b, s) {
> >>>> repeat{
> >>>> repeat{
> >>>> u <- runif(1, a[1], b[1])
> >>>> if(s - u > 0) break
> >>>> }
> >>>> v <- s - u
> >>>> if(a[2] < v && v < b[2]) break
> >>>> }
> >>>> c(u, v)
> >>>> }
> >>>> gen_mat <- function(m, a, b, s) {
> >>>> replicate(m, one_vec(a, b, s))
> >>>> }
> >>>>
> >>>> set.seed(2025)
> >>>> res <- gen_mat(10000, a, b, s)
> >>>> colSums(res)
> >>>>
> >>>>
> >>>> Hope this helps,
> >>>>
> >>>> Rui Barradas
> >>>>
> >>>>
> >>>>>
> >>>>>
> >>>>> On Tue, 22 Apr 2025 at 22:20, Brian Smith <
> briansmith199312 using gmail.com
> >>>
> >>>>> wrote:
> >>>>>>
> >>>>>> Hi Rui,
> >>>>>>
> >>>>>> Thanks for the explanation.
> >>>>>>
> >>>>>> But in this case, are we looking at the correct solution at all?
> >>>>>>
> >>>>>> My goal is to generate random vector where:
> >>>>>> 1) the first element is bounded by (a[1], b[1]) and second element
> is
> >>>>>> bounded by (a[2], b[2])
> >>>>>> 2) sum of the element is s
> >>>>>>
> >>>>>> According to the outcome,
> >>>>>> The first matrix values are bounded by c(a[1], b[1]) & second matrix
> >>>>>> values are bounded by c(a[2], b[2])
> >>>>>>
> >>>>>> But,
> >>>>>> regarding the sum, I think we should have sum (element-wise) sum
> >>>>>> should be equal to s = 0.05528650577311.
> >>>>>>
> >>>>>> How could we achieve that then?
> >>>>>>
> >>>>>> On Tue, 22 Apr 2025 at 22:03, Rui Barradas <ruipbarradas using sapo.pt>
> >> wrote:
> >>>>>>>
> >>>>>>> Às 12:39 de 22/04/2025, Brian Smith escreveu:
> >>>>>>>> Hi Rui,
> >>>>>>>>
> >>>>>>>> Many thanks for your time and insight.
> >>>>>>>>
> >>>>>>>> However, I am not sure if I could understand the code. Below is
> >> what I
> >>>>>>>> tried based on your code
> >>>>>>>>
> >>>>>>>> library(Surrogate)
> >>>>>>>> a <- c(0.015, 0.005)
> >>>>>>>> b <- c(0.070, 0.045)
> >>>>>>>> set.seed(2025)
> >>>>>>>> res <- mapply(\(a, b, s, n, m) RandVec(a, b, s, n, m),
> >>>>>>>> MoreArgs = list(s = 0.05528650577311, n = 2, m =
> >>>>>>>> 10000), a, b)
> >>>>>>>>
> >>>>>>>> res1 = res[[1]]
> >>>>>>>> res2 = res[[2]]
> >>>>>>>>
> >>>>>>>> apply(res1, 1, min) > a ## [1] TRUE TRUE
> >>>>>>>> apply(res2, 1, min) > a ## [1] FALSE TRUE
> >>>>>>>>
> >>>>>>>> I could not understand what basically 2 blocks of res signify?
> >> Which
> >>>>>>>> one I should take as final simulation of the vector? If I take the
> >>>>>>>> first block then the lower bound condition is fulfilled, but not
> >> with
> >>>>>>>> the second block. However with the both blocks, the total equals s
> >> is
> >>>>>>>> satisfying.
> >>>>>>>>
> >>>>>>>> I appreciate your further insight.
> >>>>>>>>
> >>>>>>>> Thanks and regards,
> >>>>>>>>
> >>>>>>>> On Mon, 21 Apr 2025 at 20:43, Rui Barradas <ruipbarradas using sapo.pt>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Hello,
> >>>>>>>>>
> >>>>>>>>> Inline.
> >>>>>>>>>
> >>>>>>>>> Às 16:08 de 21/04/2025, Rui Barradas escreveu:
> >>>>>>>>>> Às 15:27 de 21/04/2025, Brian Smith escreveu:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> There is a function called RandVec in the package Surrogate
> >>>>>>>>>>> which can
> >>>>>>>>>>> generate andom vectors (continuous number) with a fixed sum
> >>>>>>>>>>>
> >>>>>>>>>>> The help page of this function states that:
> >>>>>>>>>>>
> >>>>>>>>>>> a
> >>>>>>>>>>>
> >>>>>>>>>>> The function RandVec generates an n by m matrix x. Each of the
> m
> >>>>>>>>>>> columns contain n random values lying in the interval [a,b].
> The
> >>>>>>>>>>> argument a specifies the lower limit of the interval. Default
> 0.
> >>>>>>>>>>>
> >>>>>>>>>>> b
> >>>>>>>>>>>
> >>>>>>>>>>> The argument b specifies the upper limit of the interval.
> >>>>>>>>>>> Default 1.
> >>>>>>>>>>>
> >>>>>>>>>>> However in my case, the lower and upper limits are not same.
> For
> >>>>>>>>>>> example, if I need to draw a pair of number x, y, such that x +
> >>>>>>>>>>> y = 1,
> >>>>>>>>>>> then the lower and upper limits are different.
> >>>>>>>>>>>
> >>>>>>>>>>> I tried with below code
> >>>>>>>>>>>
> >>>>>>>>>>> library(Surrogate)
> >>>>>>>>>>>
> >>>>>>>>>>> RandVec(a=c(0.1, 0.2), b=c(0.2, 0.8), s=1, n=2,
> >> m=5)$RandVecOutput
> >>>>>>>>>>>
> >>>>>>>>>>> This generates error with message
> >>>>>>>>>>>
> >>>>>>>>>>> Error in if (b - a == 0) { : the condition has length > 1
> >>>>>>>>>>>
> >>>>>>>>>>> Is there any way to generate the numbers with different lower
> >> and
> >>>>>>>>>>> upper limits?
> >>>>>>>>>>>
> >>>>>>>>>>> ______________________________________________
> >>>>>>>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more,
> >> see
> >>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>>>>>>> PLEASE do read the posting guide
> >> https://www.R-project.org/posting-
> >>>>>>>>>>> guide.html
> >>>>>>>>>>> and provide commented, minimal, self-contained, reproducible
> >> code.
> >>>>>>>>>> Hello,
> >>>>>>>>>>
> >>>>>>>>>> Use ?mapply to cycle through all values of a and b.
> >>>>>>>>>> Note that the output matrices are transposed, the random vectors
> >>>>>>>>>> are the
> >>>>>>>>>> rows.
> >>>>>>>>> Sorry, this is not true. The columns are the random vectors, as
> >>>>>>>>> documented. An example setting the RNG seed, for reproducibility.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> library(Surrogate)
> >>>>>>>>>
> >>>>>>>>> a <- c(0.1, 0.2)
> >>>>>>>>> b <- c(0.2, 0.8)
> >>>>>>>>> set.seed(2025)
> >>>>>>>>> res <- mapply(\(a, b, s, n, m) RandVec(a, b, s, n, m),
> >>>>>>>>> MoreArgs = list(s = 1, n = 2, m = 5), a, b)
> >>>>>>>>>
> >>>>>>>>> res
> >>>>>>>>> #> $RandVecOutput
> >>>>>>>>> #> [,1] [,2] [,3] [,4] [,5]
> >>>>>>>>> #> [1,] 0.146079 0.1649319 0.1413759 0.257086 0.1715478
> >>>>>>>>> #> [2,] 0.253921 0.2350681 0.2586241 0.142914 0.2284522
> >>>>>>>>> #>
> >>>>>>>>> #> $RandVecOutput
> >>>>>>>>> #> [,1] [,2] [,3] [,4] [,5]
> >>>>>>>>> #> [1,] 0.5930918 0.2154583 0.6915523 0.7167089 0.3617918
> >>>>>>>>> #> [2,] 0.4069082 0.7845417 0.3084477 0.2832911 0.6382082
> >>>>>>>>>
> >>>>>>>>> lapply(res, colSums)
> >>>>>>>>> #> $RandVecOutput
> >>>>>>>>> #> [1] 0.4 0.4 0.4 0.4 0.4
> >>>>>>>>> #>
> >>>>>>>>> #> $RandVecOutput
> >>>>>>>>> #> [1] 1 1 1 1 1
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Hope this helps,
> >>>>>>>>>
> >>>>>>>>> Rui Barradas
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> library(Surrogate)
> >>>>>>>>>>
> >>>>>>>>>> a <- c(0.1, 0.2)
> >>>>>>>>>> b <- c(0.2, 0.8)
> >>>>>>>>>> mapply(\(a, b, s, n, m) RandVec(a, b, s, n, m),
> >>>>>>>>>> MoreArgs = list(s = 1, n = 2, m = 5), a, b)
> >>>>>>>>>> #> $RandVecOutput
> >>>>>>>>>> #> [,1] [,2] [,3] [,4] [,5]
> >>>>>>>>>> #> [1,] 0.2004363 0.1552328 0.2391742 0.1744857 0.1949236
> >>>>>>>>>> #> [2,] 0.1995637 0.2447672 0.1608258 0.2255143 0.2050764
> >>>>>>>>>> #>
> >>>>>>>>>> #> $RandVecOutput
> >>>>>>>>>> #> [,1] [,2] [,3] [,4] [,5]
> >>>>>>>>>> #> [1,] 0.2157416 0.4691191 0.5067447 0.7749258 0.7728955
> >>>>>>>>>> #> [2,] 0.7842584 0.5308809 0.4932553 0.2250742 0.2271045
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Hope this helps,
> >>>>>>>>>>
> >>>>>>>>>> Rui Barradas
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Este e-mail foi analisado pelo software antivírus AVG para
> >>>>>>>>> verificar a presença de vírus.
> >>>>>>>>> www.avg.com
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> The two blocks of res are the two random matrices, one for each
> >>>>>>> combination of (a,b). mapply passes each of the values in its
> >> arguments
> >>>>>>> list (the ellipses in the help page) and computes the anonymous
> >>>>>>> function
> >>>>>>> with the pairs (a[1], b[1]), (a[2], b[2]).
> >>>>>>>
> >>>>>>> Since a and b are two elements vectors the output res is a two
> >> members
> >>>>>>> named list. Your error is to compare the result of apply(res2, 1,
> >> min)
> >>>>>>> to a, when you should compare to a[2]. See the code below.
> >>>>>>>
> >>>>>>>
> >>>>>>> library(Surrogate)
> >>>>>>> a <- c(0.015, 0.005)
> >>>>>>> b <- c(0.070, 0.045)
> >>>>>>> set.seed(2025)
> >>>>>>> res <- mapply(\(a, b, s, n, m) RandVec(a, b, s, n, m),
> >>>>>>> MoreArgs = list(s = 0.05528650577311, n = 2, m =
> >>>>>>> 10000),
> >>>>>>> a, b)
> >>>>>>>
> >>>>>>> res1 = res[[1]]
> >>>>>>> res2 = res[[2]]
> >>>>>>>
> >>>>>>> # first check that the sums are correct
> >>>>>>> # these sums should be s = 0.05528650577311, up to floating-point
> >>>>>>> accuracy
> >>>>>>> lapply(res, \(x) colSums(x[, 1:5]) |> print(digits = 14L))
> >>>>>>> #> [1] 0.05528650577311 0.05528650577311 0.05528650577311
> >>>>>>> 0.05528650577311
> >>>>>>> #> [5] 0.05528650577311
> >>>>>>> #> [1] 0.05528650577311 0.05528650577311 0.05528650577311
> >>>>>>> 0.05528650577311
> >>>>>>> #> [5] 0.05528650577311
> >>>>>>> #> $RandVecOutput
> >>>>>>> #> [1] 0.05528651 0.05528651 0.05528651 0.05528651 0.05528651
> >>>>>>> #>
> >>>>>>> #> $RandVecOutput
> >>>>>>> #> [1] 0.05528651 0.05528651 0.05528651 0.05528651 0.05528651
> >>>>>>>
> >>>>>>> # now check the min and max
> >>>>>>> apply(res1, 1, min) > a[1L] ## [1] TRUE TRUE
> >>>>>>> #> [1] TRUE TRUE
> >>>>>>> apply(res2, 1, min) > a[2L] ## [1] TRUE TRUE
> >>>>>>> #> [1] TRUE TRUE
> >>>>>>>
> >>>>>>> apply(res1, 1, max) < b[1L] ## [1] TRUE TRUE
> >>>>>>> #> [1] TRUE TRUE
> >>>>>>> apply(res2, 1, max) < b[2L] ## [1] TRUE TRUE
> >>>>>>> #> [1] TRUE TRUE
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Which one should you take as final simulation of the vector? Both.
> >>>>>>> The first matrix values are bounded by c(a[1], b[1]) with column
> >> sums
> >>>>>>> equal to s.
> >>>>>>> The second matrix values are bounded by c(a[2], b[2]) with column
> >> sums
> >>>>>>> also equal to s.
> >>>>>>>
> >>>>>>> Hoep this helps,
> >>>>>>>
> >>>>>>> Rui Barradas
> >>>>>>>
> >>>>
> >>>> ______________________________________________
> >>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>> PLEASE do read the posting guide https://www.R-project.org/posting-
> >>>> guide.html
> >>>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>>
> >>> --
> >>> Este e-mail foi analisado pelo software antivírus AVG para verificar a
> >> presença de vírus.
> >>> www.avg.com
> >>
> >> ______________________________________________
> >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> https://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> https://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
[[alternative HTML version deleted]]
More information about the R-help
mailing list