Chuck,

Thanks for the reference.  MCMC with "swapping" was my original goal and I
think I'll go with that in the long run--although using sample() has worked
out for now.  I was initially concerned that checking all choose(n,2)
possible swaps would slow down the process, but in my case choose(30,2) =
435 seems quite manageable.  For much larger vectors, though, I wonder
whether an alternative would be needed.
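
In rough (and untested) code--with conforms() standing in as a placeholder
for my actual constraint check--enumerating the allowable swaps at each step
would look something like this:

  conforms <- function(x) TRUE            # placeholder: substitute the real constraint test

  allowable.swaps <- function(x) {
    pairs <- combn(length(x), 2)          # all choose(n, 2) candidate swaps
    ok <- apply(pairs, 2, function(ij) {
      y <- x
      y[ij] <- y[rev(ij)]                 # exchange the pair of elements
      conforms(y)
    })
    pairs[, ok, drop = FALSE]             # the swaps that keep the vector valid
  }

  x <- sample(30)                         # n = 30, so 435 candidate swaps per step
  ncol(allowable.swaps(x))                # size of the current neighborhood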

Thanks to everyone for your help--
Andy


On Fri, Jan 29, 2010 at 8:23 PM, Charles C. Berry <cbe...@tajo.ucsd.edu> wrote:

> On Fri, 29 Jan 2010, Andrew Rominger wrote:
>
>> Being reasonably sure that all valid permutations are equally probable is
>> important to me.  I've played around with search algorithms in permuting
>> contingency tables and find that possible solutions decrease rapidly once
>> one starts assigning values, particularly if small values are assigned
>> first, so it would seem all solutions are not equally probable (not only
>> that but one frequently encounters "dead ends" where there are values left
>> to assign and no allowable place to put them).  As such I think I'd opt to
>> use sample()... several times if needed.
>>
>> To clarify, yes, I only need one valid permutation; the idea is I'll
>> generate 1000s of ordered vectors, and then for each one generate one valid
>> permutation.
>>
>> Thanks very much for the help and insights--
>> Andy
>>
>
> Andy,
>
> If you have some sense of importance sampling and/or MCMC you might look at
>
> Zaman and Simberloff (2002, Environmental and Ecological Statistics 9,
> 405--421),
>
> which concerns sampling a binary matrix with fixed margins - not quite your
> problem, but akin to it in being a combinatorial nightmare without an
> obvious direct solution of workable size for real problems.
>
> They define a neighborhood for each allowable matrix such that swapping a
> pair of 1's at (i,j) and (k,l) with a pair of 0's at (i,l) and (k,j) (which
> doesn't violate the margin constraints) leads to a member of the
> neighborhood. IIRC, the size of the neighborhood, and the sizes of the
> neighborhoods of its members, determine the probability of staying put or
> moving to get the next element of the MCMC chain, and which member of the
> neighborhood to choose.
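>
> In code (an untested sketch), a candidate "checkerboard" swap for a binary
> matrix m could be checked and applied roughly like this:
>
>   can.swap <- function(m, i, j, k, l)              # assumes i != k and j != l
>     m[i, j] == 1 && m[k, l] == 1 && m[i, l] == 0 && m[k, j] == 0
>
>   do.swap <- function(m, i, j, k, l) {
>     m[c(i, k), c(j, l)] <- 1 - m[c(i, k), c(j, l)] # flip the 2x2 checkerboard
>     m                                              # row and column sums unchanged
>   }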
>
> I suppose something like that (i.e. defining neighborhoods of allowable
> permutations, measuring their size, and using this to guide sampling or
> develop importance weights) might apply in your case. Maybe something like
> this: start with an ordering of your n-vector that conforms to the
> constraints, look at all the choose(n,2) pairs of elements and check which
> of them could be exchanged to yield another conforming ordering; the
> allowable swaps define the neighborhood, and their number is its size.
>
> So, the idea is to develop an MCMC sampler. Run it for each ordered vector
> to get past the burn in, then take one value from it.
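>
> Schematically (untested, and assuming a helper allowable.swaps() that
> returns the index pairs of the conforming swaps just described, plus a
> conforming starting order x0), one update might look like:
>
>   mcmc.step <- function(x) {
>     nb <- allowable.swaps(x)                    # neighborhood of x
>     ij <- nb[, sample(ncol(nb), 1)]             # propose a neighbor uniformly
>     y <- x
>     y[ij] <- y[rev(ij)]
>     nb.y <- allowable.swaps(y)                  # neighborhood of the proposal
>     ## Metropolis-Hastings correction for unequal neighborhood sizes
>     if (runif(1) < ncol(nb) / ncol(nb.y)) y else x
>   }
>
>   x <- x0                                       # burn in, then keep one permutation
>   for (b in 1:1000) x <- mcmc.step(x)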
>
> HTH,
>
> Chuck
>
>
>>
>> On Thu, Jan 28, 2010 at 3:04 PM, Thomas Lumley <tlum...@u.washington.edu>
>> wrote:
>>
>>> On Thu, 28 Jan 2010, Jason Smith wrote:
>>>
>>>>> It wouldn't be guaranteed to produce any usable permutation, but it seems
>>>>> like it would be much faster and so could be repeated until an acceptable
>>>>> vector is found.  What do you think?
>>>>>
>>>>> Thanks--
>>>>> Andy
>>>>>
>>>>>
>>>>>  I think I am not understanding what your ultimate goal is so I'm not
>>>> sure I can give you appropriate advice.  Are you looking for a single
>>>> valid permutation or all of them?
>>>>
>>>> Since that constraint sets a ceiling on each subsequent value, it
>>>> seems like you could solve this problem more easily and quickly by
>>>> using a search strategy instead of random sampling or generating all
>>>> permutations then testing.  The constraint will help prune the search
>>>> space so you only generate valid permutations.  Once you are filling in a
>>>> particular position you can determine which of the remaining elements
>>>> would be valid there, so only consider those.
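>>>>
>>>> For instance (a rough, untested sketch -- here 'cap' is a hypothetical
>>>> vector of per-position ceilings standing in for your actual constraint):
>>>>
>>>>   build.one <- function(vals, cap) {
>>>>     out  <- numeric(length(vals))
>>>>     left <- vals
>>>>     for (i in seq_along(cap)) {
>>>>       ok <- which(left <= cap[i])           # remaining elements allowed here
>>>>       if (length(ok) == 0) return(NULL)     # dead end -- caller should retry
>>>>       pick <- if (length(ok) == 1) ok else sample(ok, 1)
>>>>       out[i] <- left[pick]
>>>>       left   <- left[-pick]
>>>>     }
>>>>     out
>>>>   }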
>>>>
>>>>
>>> It's easy to generate valid permutations this way.  It does not appear
>>> straightforward to ensure that all valid permutations are sampled with
>>> equal
>>> probability, which I thought was part of the specification of the
>>> problem.
>>>
>>>     -thomas
>>>
>>>
>>> Thomas Lumley                   Assoc. Professor, Biostatistics
>>> tlum...@u.washington.edu        University of Washington, Seattle
>>>
>>>
> Charles C. Berry                            (858) 534-2098
>                                             Dept of Family/Preventive Medicine
> E mailto:cbe...@tajo.ucsd.edu               UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>
>
>
