Re: [R] Small p from binomial probability function.

Benjamin Ward (ENV) Wed, 16 Oct 2013 06:22:26 -0700

Hi,

Thanks again for your answers, just so as I can get clear what is happening, 
with the uniroot method, I'm defining a function in which the binomial 
probability function pbinom is present but in addition p0 is subtracted from 
the result - in this case p0 is the large P I want to plug in so 0.05, 0.50 and 
0.95, or even just 0.05 and 0.95? Then uniroot finds the root of this function 
and doing so find me the small p I need?


Best,
Ben.


________________________________________
From: Rolf Turner [rolf.tur...@vodafone.co.nz]
Sent: 11 October 2013 02:11
To: Benjamin Ward (ENV)
Cc: Stefan Evert; R-help Mailing List
Subject: Re: [R] Small p from binomial probability function.

It is mysterious to me why the procedure proposed by Stefan Evert works.
It appears to work --- once you modify the call to binom.test() to have the
correct syntax.  In a sequence of 1000  trials with random values of N, x,
and p0, the answers from Evert's procedure agreed with the answer given
by uniroot() to within +/- 3.045e-05.

However your question was (in effect) how to solve the equation

     Pr(X <= x) = p0

for p, where X ~ Binom(N,p), with N and x known.  What this has to do with
confidence intervals for p is, to my mind at least, completely opaque.
In contrast it is obvious why the procedure using uniroot() works.

I would suggest that you stick with the uniroot() procedure in that it is
readily comprehensible.

     cheers,

     Rolf Turner

On 10/11/13 03:56, Benjamin Ward (ENV) wrote:
> Hi,
>
> Thank you for your answers, I'm not completely sure if it's to bino.test I 
> need or the uniroot. Perhaps I should explain more the idea behind the code 
> and the actual task I'm trying to do. The idea is to calculate a confidence 
> interval as to the age of two DNA sequences which have diverged, where I know 
> the number of mutations that happened in them, and I know the mutation rate.
>
> The binomial probability can be used since, mutations have a probability of 
> occurring or being observed so many times in a sequence. This is dependent on 
> the length of the DNA stretch (which equates to the number of trials since 
> each base is a possibility of observing a mutation), the probability of a 
> single mutation occurring which is p = t * u, since more time means a higher 
> probability a mutation may have occurred.
>
> So my code, using pbinom, is supposed to calculate the probability that my 
> DNA stretches contain the number of mutations observed P(X = k), given their 
> size (trials) and the probability of a single mutation (p = t * u). However 
> I'm interested in finding t: t is what is unknown, so the loop repeatedly 
> evaluates the calculation, increasing t each time and checking P(X=k), when 
> it is 0.05, 0.50 and 0.95, we record t.
>
> Ideally I'd like to rearrange this so I can get the probability of a single 
> success (mutation) p, and then divide by the mutation rate to get my t. My 
> supervisor gave my the loopy code but I imagine there is a way to plug in 
> P(X=k) as 0.05 and 0.95 and get my upper and lower t estimates.
>
> According to the R built in docs:
>
> binom.test
> Description:
>
>       Performs an exact test of a simple null hypothesis about the
>       probability of success in a Bernoulli experiment.
>
> Perhaps this is the one I need rather than uniroot?
>
> Best,
> Ben.
>
>
> ________________________________________
> From: Stefan Evert [stefa...@collocations.de]
> Sent: 10 October 2013 09:37
> To: R-help Mailing List
> Cc: Benjamin Ward (ENV)
> Subject: Re: [R] Small p from binomial probability function.
>
> Sounds like you want a 95% binomial confidence interval:
>
>          binom.test(N, P)
>
> will compute this for you, and you can get the bounds directly with
>
>          binom.test(N, P)$conf.int
>
> Actually, binom.test computes a two-sided confidence interval, which 
> corresponds roughly to 2.5 and 97.5 percentages in your approach. It doesn't 
> give you the 50% point either, but I don't think that's a meaningful quantity 
> with a two-sided test.
>
> Hope this helps,
> Stefan
>
>
> On 9 Oct 2013, at 15:53, Benjamin Ward (ENV) <b.w...@uea.ac.uk> wrote:
>
>> I got given some code that uses the R function pbionom:
>>
>> p <- mut * t
>> sumprobs <- pbinom( N, B, p ) * 1000
>>
>> Which gives the output of a probability as a percentage like 5, 50, 95.
>>
>> What the code currently does is find me the values of t I need, by using the 
>> above two code lines in a loop, each iteration it increaces t by one and 
>> runs the two lines. When sumprobs equals 5, it records the value t, then 
>> again when sumprobs is equal to 50, and again when sumprobs is equal to 95 - 
>> giving me three t values. This is not an efficient way of doing this if t is 
>> large. Is it possible to rearrange pbinom so it gives me the small p (made 
>> of mut*t) as the result of plugging in the sumprobs instead, and is there an 
>> R function that already does this?
>>
>> Since pbinom is the binomial probability equation I suppose the question is 
>> - in more mathematical terminology - can I change this code so that instead 
>> of calculating the Probability of N successes given the number of trials and 
>> the probability of a single success, can I instead calculate the probability 
>> of a single success using the probability of N successes and number of 
>> trials, and the number of successes? Can R do this for me. So instead I plug 
>> in 5, 50, and 95, and then get the small p out?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Small p from binomial probability function.

Reply via email to