If letters 1 and 2 must be equal with p=0.5, and 1 and 3 must be equal with p=0.5, then letter 1 must be the same as either 2 or 3. Therefore:
Choose a letter. Make a pair of (letter, (not letter)). Reverse the pair with p = 0.5 Concatenate your letter and the pair. Is that what you need? B. On Jun 2, 2015, at 8:26 AM, Benjamin Ward (ENV) <b.w...@uea.ac.uk> wrote: > Dear R-List, > > I have a set of possibilities I want to sample from: > > bases <- list(c('A', 'C'), c('A', 'G'), c('C', 'T')) > possibilities <- as.matrix(expand.grid(bases)) > >> possibilities > Var1 Var2 Var3 > [1,] "A" "A" "C" > [2,] "C" "A" "C" > [3,] "A" "G" "C" > [4,] "C" "G" "C" > [5,] "A" "A" "T" > [6,] "C" "A" "T" > [7,] "A" "G" "T" > [8,] "C" "G" "T" > > If I want to randomly sample one of these rows. If I do this, I find that it > is 25% likely that my choice will have an identical first and last letter > (e.g. [1,] "A" "A" "C"). It is also 25% likely that my choice will have an > identical first and third letter (e.g. [4,] "C" "G" "C"). It is not likely > at all that the second and third letter of my choice could be identical. > > What I would like to do, is sample one of the rows, but given the constraint > that the probability of drawing identical letters 1 and 2 should be 50% or > 0.5, and at the same time the probability of drawing identical letters 1 and > 3 should be 50%. I am unsure on how to do this, but I know it involves coming > up with a modified set of weights for the sample() function. My progress is > below, any advice is much appreciated. > > Best Wishes, > > Ben Ward, UEA. > > > So I have used the following code to come up with a matrix, which contains > weighting according to each criteria: > > possibilities <- as.matrix(expand.grid(bases)) > identities <- apply(possibilities, 1, function(x) c(x[1] == x[2], x[1] == > x[3], x[2] == x[3])) > prob <- matrix(rep(0, length(identities)), ncol = ncol(identities)) > consProb <- apply(identities, 1, function(x){0.5 / length(which(x))}) > polProb <- apply(identities, 1, function(x){0.5 / length(which(!x))}) > for(i in 1:nrow(identities)){ > prob[i, which(identities[i,])] <- consProb[i] > prob[i, which(!identities[i,])] <- polProb[i] > } > rownames(prob) <- c("1==2", "1==3", "2==3") > colnames(prob) <- apply(possibilities, 1, function(x)paste(x, collapse = ", > ")) > > This code gives the following matrix: > > A, A, C C, A, C A, G, C C, G, C A, A, > T C, A, T A, G, T C, G, T > 1==2 0.25000000 0.08333333 0.08333333 0.08333333 0.25000000 0.08333333 > 0.08333333 0.08333333 > 1==3 0.08333333 0.25000000 0.08333333 0.25000000 0.08333333 0.08333333 > 0.08333333 0.08333333 > 2==3 0.06250000 0.06250000 0.06250000 0.06250000 0.06250000 0.06250000 > 0.06250000 0.06250000 > > Each column is one of the choices from 'possibilities', and each row gives a > series of weights based on three different criteria: > > Row 1, that if it possible from the choices for letter 1 == letter 2, that > combined chance be 50%. > Row 2, that if it possible from the choices for letter 1 == letter 3, that > combined chance be 50%. > Row 3, that if it possible from the choices for letter 2 == letter 3, that > combined chance be 50%. > > So: > > If I used sample(x = 1:now(possibilities), size = 1, prob = prob[1,]) > repeatedly, I expect about half the choices to contain identical letters 1 > and 2. > > If I used sample(x = 1:now(possibilities), size = 1, prob = prob[2,]) > repeatedly, I expect about half the choices to contain identical letters 1 > and 3. > > If I used sample(x = 1:now(possibilities), size = 1, prob = prob[3,]) > repeatedly, I expect about half the choices to contain identical letters 2 > and 3. Except that in this case, since it is not possible. > > Note each row sums to 1. > > What I would like to do - if it is possible - is combine these three sets of > weights into one set, that when used with > sample(x = 1:nrow(possibilities, size = 1, prob = MAGICPROB) will give me a > list of choices, where ~50% of them contain identical letters 1 and 2, AND > ~50% of them contain identical letters 1 and 3, AND ~50% again contain > identical letters 2 and 3 (except in this example as it is not possible from > the choices). > > Can multiple probability weightings be combined in such a manner? > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.