Hello, # Full disclosure. I am not sure if my problem is a bug(s) in the code, or a fundamental misunderstanding on my part about what I am trying to do with these statistics. I am not familiar with maximum likelihood tests.
# I currently have two vectors Aequipecten<-c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) Samples<-c(0.03333333, 0.06666667, 0.25000000, 0.27222222, 0.27777778, 0.28888889, 0.29444444, 0.30000000, 0.31666667, 0.32777778, 0.33333333, 0.33888889, 0.35000000, 0.35555556, 0.36111111, 0.37777778, 0.38333333, 0.38888889, 0.40000000, 0.40555556, 0.41111111, 0.42222222, 0.43333333, 0.44444444, 0.45000000, 0.46111111, 0.46666667, 0.47222222, 0.47777778, 0.49444444, 0.52222222, 0.53888889, 0.55000000, 0.55555556, 0.56111111, 0.56666667, 0.57222222, 0.57777778, 0.58333333, 0.58888889, 0.60000000, 0.60555556, 0.61111111, 0.61666667, 0.62222222, 0.62777778, 0.63333333, 0.63888889, 0.64444444, 0.65000000, 0.65555556, 0.66111111, 0.66666667, 0.67222222, 0.67777778, 0.68333333, 0.68888889, 0.69444444, 0.70000000, 0.70555556, 0.71111111, 0.71666667, 0.72222222, 0.72777778, 0.73333333, 0.73888889, 0.74444444, 0.75000000, 0.75555556, 0.76111111, 0.76666667, 0.77222222, 0.77777778, 0.78333333, 0.78888889, 0.79444444, 0.80000000, 0.81111111, 0.81666667, 0.83888889, 0.85555556, 0.86666667, 0.88333333, 0.88888889, 0.89444444, 0.94444444, 0.95555556) # What I want to do is a log-likelihood model selection process where i check if a Gaussian fit or a skewed unimodal fit (among others) is better. # The first thing I do is use this function written by Oksanen http://cc.oulu.fi/~jarioksa/softhelp/hof3.pdf HOF.start<-function(y,x,model=5) { if (model >=4) { k<-glm(y ~ x + I(x^2),family=binomial) k1<-k$coefficients[1] k2<-k$coefficients[2] k3<-k$coefficients[3] u<-(-k2/2/k3) h<-plogis(k1-k2^2/4/k3) r<-log(1/h*(-2*h+2*sqrt(h))/2) b<-5.07-0.227*k3 a<-(-b*u+r) c<-b*u+4 } else { k<-glm(y ~ x,family=binomial) k1<-k$coefficients[1] k2<-k$coefficients[2] a<-(-k1) b<-(-k2) c<-0 } switch(model, p<-c(a=a), p<-c(a=a,b=b), p<-c(a=a,b=b,c=c), p<-c(a=a,b=b,c=c), p<-c(a=a,b=b,c=c,d=b), ) p } # What I am concerned about here is option (model=4). >p<-HOF.start(Aequipecten,Samples,model=4) >p a.I(x^2) b.I(x^2) c.I(x^2) -6.520338 10.330863 10.547157 # I am not sure if this function is working correctly (but see below for my alternative way of deriving equivalent p's). The goal is to use these values "p" as a starting point for the non-linear maximization. # Next, I plug those values (p) into the non-linear maximization which uses the following functions: HOF.fun<-function(p,model,x,M=1) { p0<-min(x) ra<-max(x)-p0 x<-(x-p0)/ra switch(model, fv<-rep(M/(1+exp(p[1])),length(x)), fv<-M/(1+exp(p[1]+p[2]*x)), fv<-M/(1+exp(p[1]+p[2]*x))/(1+exp(p[3])), fv<-M/(1+exp(p[1]+p[2]*x))/(1+exp(p[3]-p[2]*x)), fv<-M/(1+exp(p[1]+p[2]*x))/(1+exp(p[3]-p[4]*x)) ) fv } HOF<-function(x,y,p,model,M=1) { fv<-HOF.fun(p,model,x,M=M) sum(-dbinom(y,M,fv,log=TRUE)) } # My understanding of how this is supposed to work is that the final line of the HOF function: "sum(-dbinom(y,M,fv,log=TRUE))" is supposed to give the negative log-odds of that function being a good fit. > HOF(p,x=Sample,y=Aequipecten,model=4) [1] 63.38763 # Notice, however, that it returns a positive value, when it is supposed to be negative. What does this mean? What am I doing wrong? #Furthermore, if I attempt a non-linear maximization on this function, I continue to get a positive negative log-odds and what I think are unreasonable looking parameter estimates. What does this mean? >sol<-nlm(HOF,p,x=Sample,y=Aequipecten,model=4) > sol $minimum [1] 32.72701 $estimate [1] -8.751336 12.687632 8.281556 $gradient [1] -4.640475e-07 -5.290583e-07 5.618925e-08 $code [1] 1 $iterations [1] 16 # If I compare these results to simply doing a straight gaussian logistic regression on the vectors I get quite a different result. I'm very confident that this equation is working properly. I would expect the outputs of this function to be roughly equivalent to the results from object "p" as seen above, but they don't look similar at all. I'm not sure what this means. tBLparamsForOneSpecies <- function(theData) { x <- theData[ ,1] y <- theData[ ,2] logitReg <- glm(y ~ x + I(x^2), family=binomial) b0 <- logitReg$coefficients[1] b1 <- logitReg$coefficients[2] b2 <- logitReg$coefficients[3] opt <- (-b1)/(2*b2) tol <- 1 / sqrt(-2*b2) pmax<-1/(1+exp(b1^2/(4*b2)-b0)) theParams <- cbind(opt, tol, pmax) theParams } >theData<-cbind(Sample,Aequipecten) >tBLparamsForOneSpecies(theData) > tBLparamsForOneSpecies(theData) opt tol pmax x 0.6337473 0.1468823 0.2433407 #Nor do to the coefficients look similar to p either. > glm(Aequipecten~Sample+I(Sample^2),family=binomial) Call: glm(formula = Aequipecten ~ Sample + I(Sample^2), family = binomial) Coefficients: (Intercept) Sample I(Sample^2) -10.44 29.37 -23.18 Degrees of Freedom: 86 Total (i.e. Null); 84 Residual Null Deviance: 73.38 Residual Deviance: 68.07 AIC: 74.07 # So anyway, to sum up my problem, I'm concerned that the results of my function HOF and of my non-linear maximization of HOF keep giving me a positive log-odds. I'm not sure if this is a result of a bug in the code somewhere, or if I actually just don't understand what I'm doing in terms of the statistics. I've tried to provide as much information as possible, but I really don't know where or what the problem is. Thank you for any help, Andrew -- View this message in context: http://r.789695.n4.nabble.com/Non-linear-maximization-function-in-R-tp3916240p3916240.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.