Tim, do you know what you are talking about? If so, please do share sample code 
so we can follow along. If you pick nine uniformly distributed variables (the 
tenth is constrained) and reject sets that do not add to the desired sum then 
only (1/10)^9 or so of the draws will not be rejected. This is a vast amount of 
computation for a tiny fraction of valid draws. If you progressively pick from 
yet-unallocated values that don't exceed the total then the distributions of 
the variables will be very small for later variables considered (unequal 
distributions). There may indeed be slick algorithms for surmounting these 
problems but I cannot understand if you are outlining such a solution from your 
description.

On April 14, 2025 2:49:57 PM PDT, "Ebert,Timothy Aaron" <teb...@ufl.edu> wrote:
>
>1) select only denied individuals from the original data. This is S.
>2) There is a fixed sample size of exactly t
>3) There is a fixed target sum T such that sum(t values from S) = T
>
>You can reduce the problem. All large values where the max(S) + (t-1 smallest 
>values) > T can be eliminated from S. Likewise if (t-1 max S) + min(S) < T 
>then the smallest values can be eliminated.
>
>You can estimate the number of options by asking how many ways can I draw t 
>objects from the size of S with replacement. This is the size of S raised to 
>the t power. If Joe has a score of 6 and Kim has a score of 6 these are 
>different outcomes. If not, then S can be reduced to the number of unique 
>values.
>
>Could you randomly sample S, filter for the sum=T and then use the random 
>sample? Do you need every case when there are millions of cases (or more)?
>
>Is a long execution time important, or what do you consider a long execution 
>time? Are you planning on looking at all t and T within a range?
>
>Tim
>
>-----Original Message-----
>From: R-help <r-help-boun...@r-project.org> On Behalf Of Bert Gunter
>Sent: Monday, April 14, 2025 4:16 PM
>To: Duncan Murdoch <murdoch.dun...@gmail.com>
>Cc: r-help@r-project.org
>Subject: Re: [R] Drawing a sample based on certain condition
>
>[External Email]
>
>Just a comment:
>
>You wish to draw subsets of size n with or without replacement -- and I 
>suspect without replacement is simpler than with -- from a set of positive 
>integers that sum to a fixed value T. This sounds related to the so-called 
>subset sum problem in computational complexity theory: Given a set S of 
>positive integers and positive total, T, is there *any* subset of S (i.e. a 
>sample without replacement) whose sum is T? This is known to be NP-complete. 
>So this means that listing all such subsets must be NP-complete. I don't know 
>whether specifying that, as you have, the size of the subsets must be a fixed 
>number (obviously?) simplifies the problem sufficiently to make it 
>computationally tractable; computational wizards or suitable web searches 
>might tell you this. But for these reasons,  your query about how to sample 
>the set of all such subsets -- both those drawn with or without replacement -- 
>seems computationally difficult to me.
>
>
>Cheers,
>Bert
>
>
>
>"An educated person is one who can entertain new ideas, entertain others, and 
>entertain herself."
>
>
>
>On Mon, Apr 14, 2025 at 8:37 AM Duncan Murdoch <murdoch.dun...@gmail.com>
>wrote:
>
>> On 2025-04-14 7:26 a.m., Brian Smith wrote:
>> > Hi,
>> >
>> > For my analytical work, I need to draw a sample of certain sample
>> > size from a denied population, where population members are marked
>> > by non-negative integers, such that sum of sample members if fixed.
>> > For example,
>> >
>> > Population = 0:100
>> > Sample_size = 10
>> > Sample_Sum = 20
>> >
>> > Under this setup if my sample members are X1, X2, ..., X10 then I
>> > should have X1+X2+...+X10 = 20
>> >
>> > Sample drawing scheme may be with/without replacement
>> >
>> > Is there any R function to achieve this? One possibility is to
>> > employ naive trial-error approach, but this doesnt seem to be
>> > practical as it would take long time to get the final sample with desired 
>> > properties.
>> >
>> > Any pointer would be greatly appreciated.
>>
>> One general way to think of this problem is that you are defining a
>> distribution on the space of all possible samples of size 10, such
>> that the probability of a sample is X if the sum is 20, and zero
>> otherwise, and you want to sample from this distribution.
>>
>> There's probably a slick method to do that for your example, but if
>> you've got a general population instead of that special one, I doubt it.
>>
>> What I would do is the following:
>>
>> Define another distribution on samples that has probabilities that
>> depend on the sum of the sample, with the highest probabilities
>> attached to ones with the correct sum, and probabilities for other
>> sums declining with distance from the sum.  For example, maybe
>>
>>   P(sum) = Y/(1 + abs(sum - 20))
>>
>> for some constant Y.
>>
>> You can use MCMC to sample from that distribution and then only keep
>> the samples where the sum is exactly equal to the target sum.  If you
>> do that, you don't need to care about the value of Y. but you do need
>> to think about how proposed moves are made, and you probably need to
>> use a different function than the example above for acceptable efficiency.
>>
>> Duncan Murdoch
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat/
>> .ethz.ch%2Fmailman%2Flistinfo%2Fr-help&data=05%7C02%7Ctebert%40ufl.edu
>> %7C722a00e77d1343db507d08dd7b914147%7C0d4da0f84a314d76ace60a62331e1b84
>> %7C0%7C0%7C638802585985433701%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGki
>> OnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ
>> %3D%3D%7C0%7C%7C%7C&sdata=KNgaZ7e92Zfs51%2F%2B%2B9%2BQkGI5hiSj9OzhDtFz
>> iWXx2D0%3D&reserved=0
>> PLEASE do read the posting guide
>> https://www/.
>> r-project.org%2Fposting-guide.html&data=05%7C02%7Ctebert%40ufl.edu%7C7
>> 22a00e77d1343db507d08dd7b914147%7C0d4da0f84a314d76ace60a62331e1b84%7C0
>> %7C0%7C638802585985456014%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRy
>> dWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%
>> 3D%7C0%7C%7C%7C&sdata=Vhraq88ySxlq4IUEqvoeRvNdhjvliYdz7VDDvmQzmkE%3D&r
>> eserved=0 and provide commented, minimal, self-contained, reproducible
>> code.
>>
>
>        [[alternative HTML version deleted]]
>
>______________________________________________
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide https://www.r-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
>______________________________________________
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to