Re: [R] Generate random vectors (continuous number) with a fixed sum

Duncan Murdoch Sat, 26 Apr 2025 13:29:30 -0700

If a uniform sample is needed, it's fairly easy to turn my solution intoa rejection sampler that is uniform across the region satisfying theconstraints. It has an acceptance rate that is bounded below bysomething like 1/factorial(dim-1), e.g. 1/6 in 4 dimensions.


Duncan Murdoch


On 2025-04-26 3:18 p.m., Bert Gunter wrote:

Yes, thanks. I got my algebra/constraints wrong.
... Sigh...

Unfortunatey, this seems to make things yet more difficult.

-- Bert

"An educated person is one who can entertain new ideas, entertainothers, and entertain herself."

On Sat, Apr 26, 2025 at 10:01 AM Duncan Murdoch<murdoch.dun...@gmail.com <mailto:murdoch.dun...@gmail.com>> wrote:


    On 2025-04-24 3:25 p.m., Bert Gunter wrote:
     > Folks:
     >
     > Unless my wee old brain errs (always a danger), uniform sampling
    from an
     > n-vector <xi> for which 0 <= ai <= xi <= bi and SUM(xi) = k,  a
    constant,
     > where to ensure that the constraints are consistent (and nontrivial),
     > SUM(ai)< k and SUM(bi) > k, is a simple linear transformation
    (details left
     > to the reader) of uniform sampling from a standard n-1 dim
    simplex, for
     > which a web search yielded:
     >
    https://cs.stackexchange.com/questions/3227/uniform-sampling-from-a-simplex 
<https://cs.stackexchange.com/questions/3227/uniform-sampling-from-a-simplex>

    I think for n > 2 it is sometimes a simplex, and sometimes a different
    shape.  For example, with n = 3 and all a_i = 0 and all b_i = 1, it's a
    simplex if k < 1 or k > 2, but not for 1 < k < 2, where it's a hexagon.

    You can visualize this with the following code:

    x <- matrix(runif(30000), ncol = 3) # the full cube
    x0 <- x[1.45 < sums & sums < 1.55,] # points near k = 1.5
    rgl::plot3d(x0)


     > The proposed "solutions" in this thread which rejection sample
    sequentially
     > from uniforms do *not* produce a uniform distribution on the
    simplex, as
     > the link and references therein explain.

    My solution didn't do rejection sampling, but it won't give a uniform
    density.  I didn't realize that was requested.

     >
     > Apologies if I have misunderstood the problem and proposed solutions,
     > although there does seem to be some confusion in the thread about
    exactly
     > what was sought.

    Yes, that's true.

    Duncan Murdoch

     >
     > Cheers,
     > Bert
     >
     > "An educated person is one who can entertain new ideas, entertain
    others,
     > and entertain herself."
     >
     >
     >
     > On Thu, Apr 24, 2025 at 10:33 AM Brian Smith
    <briansmith199...@gmail.com <mailto:briansmith199...@gmail.com>>
     > wrote:
     >
     >> Hi Rui,
     >>
     >> This code is able to generate absolutely correct random sample
    vector
     >> based on the applicable constraints.
     >>
     >> I have one question though.
     >>
     >> If I understood the R code correctly then, the first element is
     >> drawing random number from Uniform distribution unconditionally,
     >> however drawing of sample point for the second element is
    conditional
     >> to the first one. Therefore if we have large vector size instead of
     >> current 2, I guess the feasible region for the last few elements
    will
     >> be very small.
     >>
     >> Will that be any problem? does there any algorithm exist where all
     >> (n-1) elements would be drawn unconditionally assuming our
    vector has
     >> n elements?
     >>
     >> Thanks and regards,
     >>
     >>
     >>
     >> On Wed, 23 Apr 2025 at 10:57, Rui Barradas <ruipbarra...@sapo.pt
    <mailto:ruipbarra...@sapo.pt>> wrote:
     >>>
     >>> Hello,
     >>>
     >>> Here are your tests and the random numbers' histograms.
     >>>
     >>>
     >>> one_vec <- function(a, b, s) {
     >>>     repeat{
     >>>       repeat{
     >>>         u <- runif(1, a[1], b[1])
     >>>         if(s - u > 0) break
     >>>       }
     >>>       v <- s - u
     >>>       if(a[2] < v && v < b[2]) break
     >>>     }
     >>>     c(u, v)
     >>> }
     >>> gen_mat <- function(m, a, b, s) {
     >>>     replicate(m, one_vec(a, b, s))
     >>> }
     >>>
     >>> a <- c(0.015, 0.005)
     >>> b <- c(0.070, 0.045)
     >>> s <- 0.05528650577311
     >>> m <- 10000L
     >>>
     >>> set.seed(2025)
     >>> res <- gen_mat(m, a, b, s)
     >>> apply(res, 1, min) > a
     >>> #> [1] TRUE TRUE
     >>> apply(res, 1, max) < b
     >>> #> [1] TRUE TRUE
     >>>
     >>> # plot histograms of one million 2d vectors
     >>> system.time(
     >>>     res1mil <- gen_mat(1e6, a, b, s)
     >>> )
     >>> #>    user  system elapsed
     >>> #>    3.01    0.06    3.86
     >>>
     >>> old_par <- par(mfrow = c(1, 2))
     >>> hist(res1mil[1L,])
     >>> hist(res1mil[2L,])
     >>> par(old_par)
     >>>
     >>>
     >>> Hope this helps,
     >>>
     >>> Rui Barradas
     >>>
     >>> Às 23:31 de 22/04/2025, Rui Barradas escreveu:
     >>>> Hello,
     >>>>
     >>>> Inline.
     >>>>
     >>>> Às 17:55 de 22/04/2025, Brian Smith escreveu:
     >>>>> i.e. we should have
     >>>>>
     >>>>> all elements of Reduce("+", res) should be equal to  s =
     >> 0.05528650577311
     >>>>>
     >>>>> My assertion is that it is not happing here.
     >>>>
     >>>> You are right, that's not what is happening. The output is n
    vectors of
     >>>> 2 elements each. It's each of these vectors that add up to s.
     >>>> Appparently I misunderstood the problem.
     >>>>
     >>>> Maybe this is what you want?
     >>>> (There is no n argument, the matrix is always 2*m)
     >>>>
     >>>>
     >>>> one_vec <- function(a, b, s) {
     >>>>     repeat{
     >>>>       repeat{
     >>>>         u <- runif(1, a[1], b[1])
     >>>>         if(s - u > 0) break
     >>>>       }
     >>>>       v <- s - u
     >>>>       if(a[2] < v && v < b[2]) break
     >>>>     }
     >>>>     c(u, v)
     >>>> }
     >>>> gen_mat <- function(m, a, b, s) {
     >>>>     replicate(m, one_vec(a, b, s))
     >>>> }
     >>>>
     >>>> set.seed(2025)
     >>>> res <- gen_mat(10000, a, b, s)
     >>>> colSums(res)
     >>>>
     >>>>
     >>>> Hope this helps,
     >>>>
     >>>> Rui Barradas
     >>>>
     >>>>
     >>>>>
     >>>>>
     >>>>> On Tue, 22 Apr 2025 at 22:20, Brian Smith
    <briansmith199...@gmail.com <mailto:briansmith199...@gmail.com>
     >>>
     >>>>> wrote:
     >>>>>>
     >>>>>> Hi Rui,
     >>>>>>
     >>>>>> Thanks for the explanation.
     >>>>>>
     >>>>>> But in this case, are we looking at the correct solution at all?
     >>>>>>
     >>>>>> My goal is to generate random vector where:
     >>>>>> 1) the first element is bounded by (a[1], b[1]) and second
    element is
     >>>>>> bounded by (a[2], b[2])
     >>>>>> 2) sum of the element is s
     >>>>>>
     >>>>>> According to the outcome,
     >>>>>> The first matrix values are bounded by c(a[1], b[1]) &
    second matrix
     >>>>>> values are bounded by c(a[2], b[2])
     >>>>>>
     >>>>>> But,
     >>>>>> regarding the sum, I think we should have sum (element-wise) sum
     >>>>>> should be equal to s = 0.05528650577311.
     >>>>>>
     >>>>>> How could we achieve that then?
     >>>>>>
     >>>>>> On Tue, 22 Apr 2025 at 22:03, Rui Barradas
    <ruipbarra...@sapo.pt <mailto:ruipbarra...@sapo.pt>>
     >> wrote:
     >>>>>>>
     >>>>>>> Às 12:39 de 22/04/2025, Brian Smith escreveu:
     >>>>>>>> Hi Rui,
     >>>>>>>>
     >>>>>>>> Many thanks for your time and insight.
     >>>>>>>>
     >>>>>>>> However, I am not sure if I could understand the code.
    Below is
     >> what I
     >>>>>>>> tried based on your code
     >>>>>>>>
     >>>>>>>> library(Surrogate)
     >>>>>>>> a <- c(0.015, 0.005)
     >>>>>>>> b <- c(0.070, 0.045)
     >>>>>>>> set.seed(2025)
     >>>>>>>> res <- mapply(\(a, b, s, n, m) RandVec(a, b, s, n, m),
     >>>>>>>>                   MoreArgs = list(s = 0.05528650577311, n
    = 2, m =
     >>>>>>>> 10000), a, b)
     >>>>>>>>
     >>>>>>>> res1 = res[[1]]
     >>>>>>>> res2 = res[[2]]
     >>>>>>>>
     >>>>>>>> apply(res1, 1, min) > a   ## [1] TRUE TRUE
     >>>>>>>> apply(res2, 1, min) > a   ## [1] FALSE  TRUE
     >>>>>>>>
     >>>>>>>> I could not understand what basically 2 blocks of res signify?
     >> Which
     >>>>>>>> one I should take as final simulation of the vector? If I
    take the
     >>>>>>>> first block then the lower bound condition is fulfilled,
    but not
     >> with
     >>>>>>>> the second block. However with the both blocks, the total
    equals s
     >> is
     >>>>>>>> satisfying.
     >>>>>>>>
     >>>>>>>> I appreciate your further insight.
     >>>>>>>>
     >>>>>>>> Thanks and regards,
     >>>>>>>>
     >>>>>>>> On Mon, 21 Apr 2025 at 20:43, Rui Barradas
    <ruipbarra...@sapo.pt <mailto:ruipbarra...@sapo.pt>>
     >>>>>>>> wrote:
     >>>>>>>>>
     >>>>>>>>> Hello,
     >>>>>>>>>
     >>>>>>>>> Inline.
     >>>>>>>>>
     >>>>>>>>> Às 16:08 de 21/04/2025, Rui Barradas escreveu:
     >>>>>>>>>> Às 15:27 de 21/04/2025, Brian Smith escreveu:
     >>>>>>>>>>> Hi,
     >>>>>>>>>>>
     >>>>>>>>>>> There is a function called RandVec in the package Surrogate
     >>>>>>>>>>> which can
     >>>>>>>>>>> generate andom vectors (continuous number) with a fixed sum
     >>>>>>>>>>>
     >>>>>>>>>>> The help page of this function states that:
     >>>>>>>>>>>
     >>>>>>>>>>> a
     >>>>>>>>>>>
     >>>>>>>>>>> The function RandVec generates an n by m matrix x. Each
    of the m
     >>>>>>>>>>> columns contain n random values lying in the interval
    [a,b]. The
     >>>>>>>>>>> argument a specifies the lower limit of the interval.
    Default 0.
     >>>>>>>>>>>
     >>>>>>>>>>> b
     >>>>>>>>>>>
     >>>>>>>>>>> The argument b specifies the upper limit of the interval.
     >>>>>>>>>>> Default 1.
     >>>>>>>>>>>
     >>>>>>>>>>> However in my case, the lower and upper limits are not
    same. For
     >>>>>>>>>>> example, if I need to draw a pair of number x, y, such
    that x +
     >>>>>>>>>>> y = 1,
     >>>>>>>>>>> then the lower and upper limits are different.
     >>>>>>>>>>>
     >>>>>>>>>>> I tried with below code
     >>>>>>>>>>>
     >>>>>>>>>>> library(Surrogate)
     >>>>>>>>>>>
     >>>>>>>>>>> RandVec(a=c(0.1, 0.2), b=c(0.2, 0.8), s=1, n=2,
     >> m=5)$RandVecOutput
     >>>>>>>>>>>
     >>>>>>>>>>> This generates error with message
     >>>>>>>>>>>
     >>>>>>>>>>> Error in if (b - a == 0) { : the condition has length > 1
     >>>>>>>>>>>
     >>>>>>>>>>> Is there any way to generate the numbers with different
    lower
     >> and
     >>>>>>>>>>> upper limits?
     >>>>>>>>>>>
     >>>>>>>>>>> ______________________________________________
     >>>>>>>>>>> R-help@r-project.org <mailto:R-help@r-project.org>
    mailing list -- To UNSUBSCRIBE and more,
     >> see
     >>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
    <https://stat.ethz.ch/mailman/listinfo/r-help>
     >>>>>>>>>>> PLEASE do read the posting guide
     >> https://www.R-project.org/posting-
    <https://www.R-project.org/posting->
     >>>>>>>>>>> guide.html
     >>>>>>>>>>> and provide commented, minimal, self-contained,
    reproducible
     >> code.
     >>>>>>>>>> Hello,
     >>>>>>>>>>
     >>>>>>>>>> Use ?mapply to cycle through all values of a and b.
     >>>>>>>>>> Note that the output matrices are transposed, the random
    vectors
     >>>>>>>>>> are the
     >>>>>>>>>> rows.
     >>>>>>>>> Sorry, this is not true. The columns are the random
    vectors, as
     >>>>>>>>> documented. An example setting the RNG seed, for
    reproducibility.
     >>>>>>>>>
     >>>>>>>>>
     >>>>>>>>> library(Surrogate)
     >>>>>>>>>
     >>>>>>>>> a <- c(0.1, 0.2)
     >>>>>>>>> b <- c(0.2, 0.8)
     >>>>>>>>> set.seed(2025)
     >>>>>>>>> res <- mapply(\(a, b, s, n, m) RandVec(a, b, s, n, m),
     >>>>>>>>>                   MoreArgs = list(s = 1, n = 2, m = 5), a, b)
     >>>>>>>>>
     >>>>>>>>> res
     >>>>>>>>> #> $RandVecOutput
     >>>>>>>>> #>          [,1]      [,2]      [,3]     [,4]      [,5]
     >>>>>>>>> #> [1,] 0.146079 0.1649319 0.1413759 0.257086 0.1715478
     >>>>>>>>> #> [2,] 0.253921 0.2350681 0.2586241 0.142914 0.2284522
     >>>>>>>>> #>
     >>>>>>>>> #> $RandVecOutput
     >>>>>>>>> #>           [,1]      [,2]      [,3]      [,4]      [,5]
     >>>>>>>>> #> [1,] 0.5930918 0.2154583 0.6915523 0.7167089 0.3617918
     >>>>>>>>> #> [2,] 0.4069082 0.7845417 0.3084477 0.2832911 0.6382082
     >>>>>>>>>
     >>>>>>>>> lapply(res, colSums)
     >>>>>>>>> #> $RandVecOutput
     >>>>>>>>> #> [1] 0.4 0.4 0.4 0.4 0.4
     >>>>>>>>> #>
     >>>>>>>>> #> $RandVecOutput
     >>>>>>>>> #> [1] 1 1 1 1 1
     >>>>>>>>>
     >>>>>>>>>
     >>>>>>>>> Hope this helps,
     >>>>>>>>>
     >>>>>>>>> Rui Barradas
     >>>>>>>>>>
     >>>>>>>>>>
     >>>>>>>>>> library(Surrogate)
     >>>>>>>>>>
     >>>>>>>>>> a <- c(0.1, 0.2)
     >>>>>>>>>> b <- c(0.2, 0.8)
     >>>>>>>>>> mapply(\(a, b, s, n, m) RandVec(a, b, s, n, m),
     >>>>>>>>>>            MoreArgs = list(s = 1, n = 2, m = 5), a, b)
     >>>>>>>>>> #> $RandVecOutput
     >>>>>>>>>> #>           [,1]      [,2]      [,3]      [,4]      [,5]
     >>>>>>>>>> #> [1,] 0.2004363 0.1552328 0.2391742 0.1744857 0.1949236
     >>>>>>>>>> #> [2,] 0.1995637 0.2447672 0.1608258 0.2255143 0.2050764
     >>>>>>>>>> #>
     >>>>>>>>>> #> $RandVecOutput
     >>>>>>>>>> #>           [,1]      [,2]      [,3]      [,4]      [,5]
     >>>>>>>>>> #> [1,] 0.2157416 0.4691191 0.5067447 0.7749258 0.7728955
     >>>>>>>>>> #> [2,] 0.7842584 0.5308809 0.4932553 0.2250742 0.2271045
     >>>>>>>>>>
     >>>>>>>>>>
     >>>>>>>>>> Hope this helps,
     >>>>>>>>>>
     >>>>>>>>>> Rui Barradas
     >>>>>>>>>>
     >>>>>>>>>>
     >>>>>>>>>
     >>>>>>>>>
     >>>>>>>>> --
     >>>>>>>>> Este e-mail foi analisado pelo software antivírus AVG para
     >>>>>>>>> verificar a presença de vírus.
     >>>>>>>>> www.avg.com <http://www.avg.com>
     >>>>>>> Hello,
     >>>>>>>
     >>>>>>> The two blocks of res are the two random matrices, one for each
     >>>>>>> combination of (a,b). mapply passes each of the values in its
     >> arguments
     >>>>>>> list (the ellipses in the help page) and computes the anonymous
     >>>>>>> function
     >>>>>>> with the pairs (a[1], b[1]), (a[2], b[2]).
     >>>>>>>
     >>>>>>> Since a and b are two elements vectors the output res is a two
     >> members
     >>>>>>> named list. Your error is to compare the result of
    apply(res2, 1,
     >> min)
     >>>>>>> to a, when you should compare to a[2]. See the code below.
     >>>>>>>
     >>>>>>>
     >>>>>>> library(Surrogate)
     >>>>>>> a <- c(0.015, 0.005)
     >>>>>>> b <- c(0.070, 0.045)
     >>>>>>> set.seed(2025)
     >>>>>>> res <- mapply(\(a, b, s, n, m) RandVec(a, b, s, n, m),
     >>>>>>>                  MoreArgs = list(s = 0.05528650577311, n =
    2, m =
     >>>>>>> 10000),
     >>>>>>> a, b)
     >>>>>>>
     >>>>>>> res1 = res[[1]]
     >>>>>>> res2 = res[[2]]
     >>>>>>>
     >>>>>>> # first check that the sums are correct
     >>>>>>> # these sums should be s = 0.05528650577311, up to
    floating-point
     >>>>>>> accuracy
     >>>>>>> lapply(res, \(x) colSums(x[, 1:5]) |> print(digits = 14L))
     >>>>>>> #> [1] 0.05528650577311 0.05528650577311 0.05528650577311
     >>>>>>> 0.05528650577311
     >>>>>>> #> [5] 0.05528650577311
     >>>>>>> #> [1] 0.05528650577311 0.05528650577311 0.05528650577311
     >>>>>>> 0.05528650577311
     >>>>>>> #> [5] 0.05528650577311
     >>>>>>> #> $RandVecOutput
     >>>>>>> #> [1] 0.05528651 0.05528651 0.05528651 0.05528651 0.05528651
     >>>>>>> #>
     >>>>>>> #> $RandVecOutput
     >>>>>>> #> [1] 0.05528651 0.05528651 0.05528651 0.05528651 0.05528651
     >>>>>>>
     >>>>>>> # now check the min and max
     >>>>>>> apply(res1, 1, min) > a[1L]   ## [1] TRUE TRUE
     >>>>>>> #> [1] TRUE TRUE
     >>>>>>> apply(res2, 1, min) > a[2L]   ## [1] TRUE TRUE
     >>>>>>> #> [1] TRUE TRUE
     >>>>>>>
     >>>>>>> apply(res1, 1, max) < b[1L]   ## [1] TRUE TRUE
     >>>>>>> #> [1] TRUE TRUE
     >>>>>>> apply(res2, 1, max) < b[2L]   ## [1] TRUE TRUE
     >>>>>>> #> [1] TRUE TRUE
     >>>>>>>
     >>>>>>>
     >>>>>>>
     >>>>>>> Which one should you take as final simulation of the
    vector? Both.
     >>>>>>> The first matrix values are bounded by c(a[1], b[1]) with
    column
     >> sums
     >>>>>>> equal to s.
     >>>>>>> The second matrix values are bounded by c(a[2], b[2]) with
    column
     >> sums
     >>>>>>> also equal to s.
     >>>>>>>
     >>>>>>> Hoep this helps,
     >>>>>>>
     >>>>>>> Rui Barradas
     >>>>>>>
     >>>>
     >>>> ______________________________________________
     >>>> R-help@r-project.org <mailto:R-help@r-project.org> mailing
    list -- To UNSUBSCRIBE and more, see
     >>>> https://stat.ethz.ch/mailman/listinfo/r-help
    <https://stat.ethz.ch/mailman/listinfo/r-help>
     >>>> PLEASE do read the posting guide
    https://www.R-project.org/posting- <https://www.R-project.org/posting->
     >>>> guide.html
     >>>> and provide commented, minimal, self-contained, reproducible code.
     >>>
     >>>
     >>> --
     >>> Este e-mail foi analisado pelo software antivírus AVG para
    verificar a
     >> presença de vírus.
     >>> www.avg.com <http://www.avg.com>
     >>
     >> ______________________________________________
     >> R-help@r-project.org <mailto:R-help@r-project.org> mailing list
    -- To UNSUBSCRIBE and more, see
     >> https://stat.ethz.ch/mailman/listinfo/r-help
    <https://stat.ethz.ch/mailman/listinfo/r-help>
     >> PLEASE do read the posting guide
     >> https://www.R-project.org/posting-guide.html
    <https://www.R-project.org/posting-guide.html>
     >> and provide commented, minimal, self-contained, reproducible code.
     >>
     >
     >       [[alternative HTML version deleted]]
     >
     > ______________________________________________
     > R-help@r-project.org <mailto:R-help@r-project.org> mailing list
    -- To UNSUBSCRIBE and more, see
     > https://stat.ethz.ch/mailman/listinfo/r-help
    <https://stat.ethz.ch/mailman/listinfo/r-help>
     > PLEASE do read the posting guide
    https://www.R-project.org/posting-guide.html
    <https://www.R-project.org/posting-guide.html>
     > and provide commented, minimal, self-contained, reproducible code.


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generate random vectors (continuous number) with a fixed sum

Reply via email to