Rune,
Thank you very much for your response.
I don't actually have the models that failed to converge from the first
(glmulti) part, as they were not saved with the confidence set. glmulti
generates thousands of models, so it seems reasonable that a few of these
may not converge.
The clmm() model I provided was just an example - not all models have 17
parameters. Only one or two models produced errors (the example I gave
being one of them), so perhaps overparameterisation is the root of the
problem.
Regarding incomplete data - there are only 103 (of 314) records where I
have data for every predictor. The number of observations included will
obviously vary between models: models with fewer predictors will include
more observations. glmulti acts as a wrapper for another function, meaning
(in this case) NAs are treated as they would be in clm(). Is there a way
around this (apart from filling in the missing data)? I believe it's
possible to limit model complexity in the glmulti call - which may or may
not increase the number of observations - how would that affect the
interpretation of the results?
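To make those two options concrete, this is roughly what I have in mind (a
sketch using the placeholder names from my example; I may be misremembering
the glmulti arguments, e.g. maxsize):

library(ordinal)   # clm()
library(glmulti)

## Option 1: restrict to complete cases up front, so every candidate
## model is fitted to the same 103 records.
vars <- c("dependent", paste0("predictor_", 1:9))
database_cc <- database[complete.cases(database[, vars]), ]

## Option 2: cap model complexity in the glmulti call itself, e.g. via
## maxsize (maximum number of terms per candidate model) - argument name
## to be checked against the glmulti documentation.
model.2.10b <- glmulti(as.factor(dependent) ~ predictor_1 * predictor_2 *
                         predictor_3 * predictor_4 * predictor_5 * predictor_6 *
                         predictor_7 * predictor_8 * predictor_9,
                       data = database, fitfunc = clm, link = "probit",
                       method = "g", crit = aicc, confsetsize = 200,
                       marginality = TRUE, maxsize = 5)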
Thanks again,
Tom
On 16/04/13 07:54, Rune Haubo wrote:
On 15 April 2013 13:18, Thomas <thomasfox...@aol.com> wrote:
Dear List,
I am using both the clm() and clmm() functions from the R package 'ordinal'.
I am fitting an ordinal dependent variable with 5 categories to 9 continuous
predictors, all of which have been normalised (mean subtracted then divided by
standard deviation), using a probit link function. From this global model I am
generating a confidence set of 200 models using clm() and the 'glmulti' R
package. This produces these errors:
> model.2.10 <- glmulti(as.factor(dependent) ~
predictor_1*predictor_2*predictor_3*predictor_4*predictor_5*predictor_6*predictor_7*predictor_8*predictor_9,
data = database, fitfunc = clm, link = "probit", method = "g", crit = aicc,
confsetsize = 200, marginality = TRUE)
...
After 670 generations:
Best model:
as.factor(dependent)~1+predictor_1+predictor_2+predictor_3+predictor_4+predictor_5+predictor_6+predictor_8+predictor_9+predictor_4:predictor_3+predictor_6:predictor_2+predictor_8:predictor_5+predictor_9:predictor_1+predictor_9:predictor_4+predictor_9:predictor_5+predictor_9:predictor_6
Crit= 183.716706496392
Mean crit= 202.022138576506
Improvements in best and average IC have been below the specified goals.
Algorithm is declared to have converged.
Completed.
There were 24 warnings (use warnings() to see them)
warnings()
Warning messages:
1: optimization failed: step factor reduced below minimum
2: optimization failed: step factor reduced below minimum
3: optimization failed: step factor reduced below minimum
etc.
I am then re-fitting each of the 200 models with the clmm() function, with 2
random factors (family nested within order). I get this error in a few of the
re-fitted models:
> model.2.glmm.2 <- clmm(as.factor(dependent) ~ 1 + predictor_1 + predictor_2 +
predictor_3 + predictor_6 + predictor_7 + predictor_8 + predictor_9 + predictor_6:predictor_2 +
predictor_7:predictor_2 + predictor_7:predictor_3 + predictor_8:predictor_2 +
predictor_9:predictor_1 + predictor_9:predictor_2 + predictor_9:predictor_3 +
predictor_9:predictor_6 + predictor_9:predictor_7 + predictor_9:predictor_8+ (1|order/family),
link = "probit", data = database)
summary(model.2.glmm.2)
Cumulative Link Mixed Model fitted with the Laplace approximation
formula: as.factor(dependent) ~ 1 + predictor_1 + predictor_2 + predictor_3 +
predictor_6 + predictor_7 + predictor_8 + predictor_9 + predictor_6:predictor_2
+ predictor_7:predictor_2 +
predictor_7:predictor_3 + predictor_8:predictor_2 + predictor_9:predictor_1 +
predictor_9:predictor_2 +
predictor_9:predictor_3 + predictor_9:predictor_6 + predictor_9:predictor_7 +
predictor_9:predictor_8 + (1 | order/family)
data: database
link threshold nobs logLik AIC niter max.grad cond.H
probit flexible 103 -65.56 173.13 58(3225) 8.13e-06 4.3e+03
Random effects:
Var Std.Dev
family:order 7.493e-11 8.656e-06
order 1.917e-12 1.385e-06
Number of groups: family:order 12, order 4
Coefficients:
Estimate Std. Error z value Pr(>|z|)
predictor_1 0.40802 0.78685 0.519 0.6041
predictor_2 0.02431 0.26570 0.092 0.9271
predictor_3 -0.84486 0.32056 -2.636 0.0084 **
predictor_6 0.65392 0.34348 1.904 0.0569 .
predictor_7 0.71730 0.29596 2.424 0.0154 *
predictor_8 -1.37692 0.75660 -1.820 0.0688 .
predictor_9 0.15642 0.28969 0.540 0.5892
predictor_2:predictor_6 -0.46880 0.18829 -2.490 0.0128 *
predictor_2:predictor_7 4.97365 0.82692 6.015 1.80e-09 ***
predictor_3:predictor_7 -1.13192 0.46639 -2.427 0.0152 *
predictor_2:predictor_8 -5.52913 0.88476 -6.249 4.12e-10 ***
predictor_1:predictor_9 4.28519 NA NA NA
predictor_2:predictor_9 -0.26558 0.10541 -2.520 0.0117 *
predictor_3:predictor_9 -1.49790 NA NA NA
predictor_6:predictor_9 -1.31538 NA NA NA
predictor_7:predictor_9 -4.41998 NA NA NA
predictor_8:predictor_9 3.99709 NA NA NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Threshold coefficients:
Estimate Std. Error z value
0|1 -0.2236 0.3072 -0.728
1|2 1.4229 0.3634 3.915
(211 observations deleted due to missingness)
Warning message:
In sqrt(diag(vc)[1:npar]) : NaNs produced
This warning is due to a (near) singular variance-covariance matrix of
the model parameters, which in turn is due to the fact that the model
converged to a boundary solution: both random effects variance
parameters are zero. If you exclude the random terms and refit the
model with clm, the variance-covariance matrix will probably be well
defined and standard errors can be computed.
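For example, something along these lines (a sketch keeping the fixed-effect
part of the formula above and simply dropping the random terms):

## Same fixed effects as model.2.glmm.2, random terms removed, refitted with clm()
model.2.glm.2 <- clm(as.factor(dependent) ~ predictor_1 + predictor_2 +
                       predictor_3 + predictor_6 + predictor_7 + predictor_8 +
                       predictor_9 + predictor_2:predictor_6 +
                       predictor_2:predictor_7 + predictor_3:predictor_7 +
                       predictor_2:predictor_8 + predictor_1:predictor_9 +
                       predictor_2:predictor_9 + predictor_3:predictor_9 +
                       predictor_6:predictor_9 + predictor_7:predictor_9 +
                       predictor_8:predictor_9,
                     link = "probit", data = database)
summary(model.2.glm.2)  # standard errors should now be computable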
Another thing is that you are fitting 17 regression parameters and 2
random effect terms (which in the end do not count) to only 103
observations. I would be worried about overfitting or perhaps even
non-fitting. I think I would also be concerned about the 211
observations that are incomplete, and I would be careful with
automatic model selection/averaging etc. on incomplete data (though I
don't know how/if glmulti actually deals with that).
I have tried a number of different approaches, each has its own problems. I
have fixed these using various suggestions from online forums (eg
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/015328.html,
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q2/016165.html) and this
is as good as I can get it.
After the first stage (generating the model set with glmulti) I tested every
model in the confidence set individually - there were no errors - but there was
clearly a problem during the model selection process. Should I be worried?
I don't know - I don't use glmulti or automatic model selection
regularly, so I don't know what the consequences might be.
The question seems to be what caused the potential non-convergences
for some of the models that were not chosen. If they didn't converge
because the models are not identifiable, then I suppose all is ok, but
if they are relevant models that should have converged, then there
might be a problem. However, if a model does not converge, there is
usually a good reason for it, so I am not particularly worried that
there are relevant models among those that did not converge. Without
considering a particular model, it is hard to tell why it might not
have converged, but if you can pinpoint the models that trigger the
warnings/errors, I would be happy to take a further look at them.
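One rough way to pinpoint them (a sketch; it assumes you can extract the
200 candidate formulas from the glmulti object into a list, here called
confset_formulas):

## Refit each candidate formula with clm() and record any warnings or errors,
## so the problematic models can be identified and inspected individually.
check_fit <- function(f, data) {
  msgs <- character(0)
  fit <- withCallingHandlers(
    tryCatch(clm(f, link = "probit", data = data),
             error = function(e) conditionMessage(e)),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(fit = fit, warnings = msgs)
}
results <- lapply(confset_formulas, check_fit, data = database)
## Indices of models that produced warnings or failed outright
which(sapply(results, function(r) length(r$warnings) > 0 || is.character(r$fit)))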
Hope this helps,
Rune
No errors appear in the top 5% of re-fitted models (which are the only ones
I will be using); however, I am concerned that the errors may be indicative
of a problem with my approach.
A further worry is that the errors might be removing models that could
otherwise be included.
Any help would be much appreciated.
Tom
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.