Re: [R] using near-zero probabilities in optimization

Benedikt Gehr, Institut für Evolutionsbiologie und Umweltwissenschaften Wed, 10 Mar 2010 01:20:21 -0800

Dear Ben

I haven't looked at the profiles yet, I'm not sure how to.

I'm doing population reconstruction from age-at-harvest data. I'm using amultinomial distribution to model the observed harvest numbers.But I didn't know about the Dirichlet-Multinomial so far. I think that mightreally be a good option because overdispersion is likeliy to be an issuehere.Below I have copied my code for the simulation of data (the first exampleworks, the other gives poor fit) and subsequent optimization. I'm sorry thecode is very long, but I didn't know how to short it down.


The problem:

Q=harvest rate -> if harvest rates are too small the model fits the datavery bad.


Here is the R code:

library(boot)
library(bbmle)
library(demogR)
library(msm)

##########################################
##########################################
##Stochastic model -> DATA SIMULATION
##########################################

a<-8                                                                         # 
nr of age classes
y<-30                                                                           
     # nr of years

m<-c(rep(0.2,2),rep(0.75,4),0.5,0.2) # productivity -> nr ofyoung/female

m.time<-matrix(rtnorm(a*y,m,0.1,lower=0),ncol=a,byrow=T)


############################################################
##TWO VERSIONS OF Q=HARVEST RATE->Q1 WORKS, Q2 GIVES BAD FIT
############################################################

Q<-c(rep(0.2,2),rep(0.3,3),0.4,0.6,0.9) #WORKS # true mean harvestmortality ratesQ<-c(rep(0.02,2),rep(0.03,3),0.4,0.6,0.9) #BAD FIT # true mean harvestmortality rates

Q.time<-matrix(rtnorm(a*y,Q,0.06,lower=0,upper=1),ncol=a,byrow=T) # addingrandom noise

#############################################################
#############################################################


P<-c(0.5,rep(0.9,4),rep(0.8,2),0.7)                                  # natural 
survival rates

P.time<-matrix(rtnorm(a*y,P,0.06,lower=0,upper=1),ncol=a,byrow=T) # addingrandom noise

L.time<-(1-Q.time[,1:(a-1)])*P.time[,2:a] # Overall survival -> Derivedparameter


F.time<-m.time*P.time[,1]                                            # Fecundity   
    -> Derived parameter

Lm.time<-array(0,c(a,a,y)) # Lesli-matrix-array -> for each year a newmatrix

for (i in 1:y){
Lm.time[,,i]<-odiag(L.time[i,],-1)                                   # survival 
in the off-diagonal
}

for (i in 1:y){                                                         # 
fecundity in the first row
Lm.time[1,,i]<-F.time[i,]
}

Hm.time<-array(0,c(a,a,y)) # Harvest-matrix-array -> for each year anew matrix

for (i in 1:y){
Hm.time[,,i]<-odiag(Q.time[i,])                                      # 
mortality rates in the diagonal
}

N1<-rep(200,a)                                                               # 
Define the starting vector of cohort sizes
N.true<-matrix(rep(0,a*y),ncol=a)                                    # True 
abundance matrix
N.true[1,]<-N1
C<-matrix(rep(0,a*y),ncol=a)                                         # Harvest 
matrix

for (i in 1:(y-1)){
N.true[i+1,]<-Lm.time[,,i]%*%N.true[i,]
}
N.true<-round(N.true)                                                        # 
round the counts of the true population

for (i in 1:y){
C[i,]<-Hm.time[,,i]%*%N.true[i,]
}
C<-round(C)                                                                  
#round the counts of the harvest numbers

plot(rowSums(N.true)~seq(1:y),ylim=c(0,4000),col="red") # Plot the timeseries of the simulated datalines(rowSums(N.true)~seq(1:y),col="red") # true population and trueharvest numbers

points(rowSums(C)~seq(1:y),ylim=c(0,4000),col="blue")
lines(rowSums(C)~seq(1:y),col="blue")


###########################################################################
###########################################################################

############################
##ANALYSIS OF SIMULATED DATA
############################

##Simulated harvest numbers of a cohort harvested over 30 years (includingrandom noise)

pop<-N.true                                                  # True population 
size
cohort<-C                                                    # harvest matrix
R<-nrow(cohort)
L<-ncol(cohort)
        
h<-matrix(rep(Q,R),ncol=a,byrow=T)   # estimated harvest rates
S<-matrix(rep(P[-1],R),ncol=a-1,byrow=T)     # survival rates

##############################################
##Function to calculate the cell probabilities
##############################################

predprob <- function(S=S,h=h) {
  R<-nrow(cohort)
  L<-ncol(cohort)
  p <- matrix(rep(0,(R)*(L)),ncol=L)         # matrix of cell probabilities
        p[R,1]<-h[R,1]
        p[1,L]<-h[1,L]

ay<-rep(0,L+R+1) # vector of last cell probabilities ->1-sum(rest));Gove(2002)

        ay[L+R]<-1-h[1,L]
        ay[2]<-1-h[R,1]

index<-3 # ay[1]=Null, ay[2]=1-h[R,1],ay[L+R]=1-h[1,L],ay[L+R+1]=Null



for (i in -(R-2):(L-2)){                                # Calculating the cell 
prob after Gove (2002)

p[row(p) == col(p) - i] <-odiag(h,i)*c(1,cumprod((1-odiag(h[1:(R-1),1:(L-1)],i))

                                                *odiag(S[1:(R-1),],i)))
        
        ay[index]<-1-sum(odiag(p[1:R,],i))
        
        index<-index+1
   }

        p<-rbind(p,ay[1:L])
        p<-cbind(p,ay[length(ay):(L+1)])
return(p)

 }

predprob(S,h)                                           #test the function

####################################
##Function to calculate the multinomial ->from Ben Bolker
####################################

## multinom WITHOUT ROUNDING & without error-checking
dmnom <- function(x,prob,log=FALSE) {
  r <- lgamma(sum(x) + 1) + sum(x * log(prob) - lgamma(x + 1))
  if (log) r else exp(r)
}

###########################
## function to be optimized
###########################

mfun <- function(logN,S=S,h=h,cohort=cohort) {
  L<-ncol(cohort)
  R<-nrow(cohort)

d<-matrix(rep(NA,(R+1)*(L+1)),ncol=L+1) # last row and last col of thematrixd[1:R,1:L]<-log(cohort) # are the values to be optimized->Logtranformed data!


  index<-1
  index1<-(R+L)

for (i in -(R-1):(L-1)){                                # values for the top row
if (i<= -(R-L) ){
d[(R+1),(index+1)]<-log(exp(logN[index])-sum(odiag(cohort,i)))} else
if (i> -(R-L)& i<1 ){
d[index1,(L+1)]<-log(exp(logN[index])-sum(odiag(cohort,i)))} else
if (i>0){
d[index1,(L+1)]<-log(exp(logN[index])-sum(odiag(cohort,i)))}

index<-index+1
index1<-index1-1
}

pp<-predprob(S,h)


lhood<-0                                                     # The Likelihood 
function -> to be maximized

for (i in -(R-1):(L-1)){
 lhood<-lhood+(-dmnom(exp(odiag(d,i)),prob=odiag(pp,i),log=TRUE))
}
return(lhood)
}

logN<-log(rep(1000,(R+L-1)))                         # Test the function with 
test-values
mfun(logN,S,h,cohort)

##########################################################################################
##########################################################################################

######################################
##Starting values for the optimization
######################################

N<-rep(500,(R+L-1))                                  # optimization,->N-x1-x2-x3
svec = log(N)                                           # transform to avoid 
constraints (x>0)
names(svec) <- paste("logN",1:(R+L-1),sep="")
parnames(mfun) <- names(svec)

######################################
##Lower bounds for the L-BFGS-B-method
######################################

index<-1
lower<-rep(0,(R+L-1))

for (i in -(R-1):(L-1)){
lower[index]<-log(sum(odiag(cohort,i)))
index<-index+1
}

lower<-lower+0.001
names(lower)<-names(svec)
exp(lower)                                                      # check lower 
bounds

#########################
##Optimization with mle2
#########################

## use bounded optimization

m1 = mle2(mfun,start=svec,

method="L-BFGS-B",lower=lower,upper=10, # lower=must be larger than"sum(odiag(x))"<-if smaller NANs are produced

  vecpar=TRUE,
  data=list(S=S,h=h,cohort=cohort))

coef(m1)

Nest<-exp(coef(m1))                                  # estimated cohort sizes 
for row and colum 1

############################################################################################
############################################################################################

#####################
##Analyse the Goodness of fit
#####################

Nya<-matrix(rep(0,length(cohort)),ncol=L) # project the N estimates into amatrix

Nya[1,1:L]<-Nest[R:length(Nest)]
Nya[2:R,1]<-rev(Nest[1:(R-1)])

stot<-(1-Q[1:(a-1)])*P[2:a] # derived survival -> product of (1-h(i)) amds(i)

smat<-matrix(rep(stot,R),ncol=L-1,byrow=T)

for (i in 1:ncol(smat)){                                # project the N 
estimates trough time
Nya[2:R,i+1]<-Nya[1:(R-1),i]*smat[1:(R-1),i]
}


#######################
##Expected harvest values compared with observed harvest numbers
#######################

exp<-Nya*h
obs<-cohort

gof<-((obs-exp)^2)/exp # Goodness of fit for the singleestimates # Overall GOF statistic

gof.stat<-sum(gof)                                   # GOF statistic

p.value<-1-pchisq(gof.stat,(length(obs)-1)) # p-value with appropriatedegrees of freedom

p.value

plot(obs~exp,main="Observed harvest numbers vs expected from themodel",xlim=c(0,200),ylim=c(0,200))

abline(0,1)

##################
##Plots
##################
##Plot true population trend <-> compare with estimated trend

plot(rowSums(N.true)~seq(1:y),ylim=c(0,5000),col="red",pch=17,main="True vsestimated population trend")

lines(rowSums(N.true)~seq(1:y),col="red")

points(rowSums(Nya)~seq(1:y),ylim=c(0,5000),col="blue",pch=16)
lines(rowSums(Nya)~seq(1:y),col="blue")

###############################################################
###############################################################
##########
##END#####
##########

Thanks a lot for the help!!!!

Benedikt



On Tue, 9 Mar 2010 15:34:09 +0000 (UTC)
 Ben Bolker <bol...@ufl.edu> wrote:

Benedikt Gehr <benedikt.gehr <at> ieu.uzh.ch> writes:
Hi there
I am using mle2 for a multinomial likelihood optimization problem. Myfunction works fine when I'm using simulated data, however my cellprobabilities of the true data for the multinomial likelihood aresometimes very small (in some cases <0.00...) and the estimated pointestimates fit the true vlaues quite poorly. Is there a way how to handlenear zero probabilities in maximum likelihood optimization?
 Hard to say without more detail.  Can you send a reproducible
example (your data, or a small subset of your data, or
some way of simulating the data that *does* create the problem)?

 Since you're using log-likelihoods already (within mle2) the
problem is unlikely (?) to be numerical -- R doesn't have any
problem with very large negative log-likelihoods.  Do your
likelihood profiles look reasonable?  If so then the problem
is more likely that your model doesn't fit the data well than
that you are having convergence problems.

 Have you considered the possibility of overdispersion
(non-homogeneity/non-independence), e.g. via a Dirichlet-multinomial
model?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] using near-zero probabilities in optimization

Reply via email to