On Tue, 2 Oct 2007, Paul Johnson wrote:
Jérôme,
As a first attempt, how about the function below. It works (or not) by
randomly sorting the rows and columns, then searching the table for
"squares" with the corners = matrix(c(1,0,0,1),ncol=2) and subtracting them
from 1 to give matrix(c(0,1,1,0),ncol=2) (and vice versa). Randomized
matrices can be produced as a chain where each permutation is seeded with
the previous one.
Potential problems:
1. Does it really explore all possible permutations?
Yes. If you randomize. Not sure about 'only if you randomize'.
Does it do it in an unbiased way?
No. If you mean are all choices equiprobable. So, you either need to add
importance weights or do a rejection step.
Zaman and Simberloff (2002, Environmental and Ecological Statistics 9,
405--421) go through this and propose methods for the problem.
They give an example of a binary matrix with margins (2, 1, 1) and (1, 2,
1). There are five matrices that satisfy the constraints that the
margins are as specified and the cells are either zero or one. Study of
that simple setup is revealing of the problems encountered in developing
sampling schemes for the general problem.
I recall they have a C program for the sampler they propose. Probably the
path of least resistance is to get hold of that.
2. Related to above: there is potential autocorrelation (although I haven't
found it with my data set), so might need some dememorisation steps.
3. It's slow and dememorisation makes it slower.
4. It isn't clear to me whether it needs the added stochastic element, i.e.
squares are only flipped if "runif(1)<0.5". In practice it seems to work
without it (as far as I can tell, i.e. it isn't autocorrelated using my data
set).
I suspect there's a much better way of doing this, which might take the
margins as an input, and therefore wouldn't be directly derived from any
particular matrix structure.
I thought that, too. You can see my _retraction_ here:
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/98897.html
The difficulty is that under the constraints and the relevant null
hypothesis all matrices are equiprobable. It is easy to generate random
matrices satisfying all constraints, but not the null.
Brute force combinatorics is not attractive either. The number of binary
matrices can be very large for problems of practical size. As an example,
if you have a 20 by 20 binary table with two margins of rep(10,20), there
are many more than 2^100 possible tables. There is a straightforward way
of enumerating them, but it is not computationally feasible.
HTH,
Chuck
Paul
###############################################################
# function to permute binary matrix keeping margins fixed.
# the input "x" is the binary matrix to be permuted
permute.struct<-
function(x)
{
x<-x[rownames(x)[sample(1:nrow(x))],colnames(x)[sample(1:ncol(x))]]
pattern<-c(1,0,0,1)
for(i in 1:(nrow(x)-1))
for(j in 1:(ncol(x)-1))
for(ii in (i+1):nrow(x))
for(jj in (j+1):ncol(x))
{
query<-c(x[c(i,ii),c(j,jj)])
if((all(query-pattern==0) || all(query==1-pattern)) &&
runif(1)<0.5)
x[c(i,ii),c(j,jj)] <- 1 - x[c(i,ii),c(j,jj)]
}
x
}
###############################################################
--------------------------------------------------------
From: Mathieu Jérôme <jerome.mathieu_at_snv.jussieu.fr>
Date: Thu 05 Apr 2007 - 13:34:01 GMT
Dear R Users,
How can I randomize a matrix (with contains only 0 and 1) with the
constraint that margin totals (columns and rows sums) are kept constant?
Some have called this "swap permutation" or something like that.
The principle is to find bloc of
10
01
and change it for
01
10
there can be several rows or columns between these numbers. It probably
exists in R, but I can't find the function. I am aware of permcont, but it
works only for 2x2 matrices
thanks in advance
Jerome
--
Jérôme Mathieu
Laboratory of soil tropical ecology
UPMC
[EMAIL PROTECTED]
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED] UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.