On Thu, 2010-06-03 at 17:00 +0200, Joris Meys wrote: > On Thu, Jun 3, 2010 at 9:27 AM, Gavin Simpson <gavin.simp...@ucl.ac.uk>wrote: > > > > > vegan is probably not too useful here as the response is univariate; > > counts of ducks. > > > > If we assume that only one species is counted and of interest for the whole > research. I (probably wrongly) assumed that data for multiple species was > available. > > Without knowledge about the whole research setup it is difficult to say > which method is the best, or even which methods are appropriate. VGAM is > indeed a powerful tool, but : > > > proportion_non_zero <- (sum(ifelse(data$duck == 0,0,1))/182) > means 182 observations in the dataset > > > model_NBI <- gamlss(duck ~ cs(HHCDI200,df=3) + cs(HHCDI1000,df=3) + > cs(HHHDI200,df=3) + cs(HHHDI1000,df=3) + cs(LFAP200,df=3),data=data, > family= NBI) > > is 5 splines with 3df, an intercept, that's a lot of df for only 182 > observations. using VGAM ain't going to help here.
How do you know? > I'd reckon that the model > itself should be reconsidered, rather than the distribution used to fit the > error terms. I was going to mention that too, but the OP did reduce this down to a single smooth and the problem of increasing deviance remained. Hence trying to fit a /similar/ model in other software might give an indication whether the problems are restricted to a single software or a more general issue of the data/problem? At this stage the OP is stuck not knowing what is wrong; (s)he has nothing to do model checking on etc. Trying zeroinfl() and fitting a parametric model, for example, might be a useful starting point, then move on to models with smoothers if required. G > > Cheers > Joris > > > > > > You could also use the hurdle model from the pscl library, but this is > > not > > > guaranteed to work better. It is not a zero-inflation model, but a > > > two-component model that fits the zeros using eg a binomial and the rest > > of > > > the counts using eg a poisson or a negative binomial. > > > > zeroinfl() in the same (pscl) package will fit ZIP and ZINB models. > > Neither of these will be a GAM though, so if the OP was wanting smooth > > functions of covariates they may not so useful, although bs() in package > > splines might be used to include smooth terms into these functions. > > > > gam() in mgcv has the negative binomial family that might be a useful > > starting point to modelling the overdispersion. > > > > Package vgam has functions to fit ZIP/ZINB or the hurdle models > > mentioned above with smooth functions of covariates. > > > > In any case the OP should probably seek help directly from the gamlss > > package maintainers to get gamlss working on their data or suggest why > > the fitting isn't working. > > > > HTH > > > > G > > > > > > > > Cheers > > > Joris > > > > > > On Wed, Jun 2, 2010 at 12:56 PM, Strubbe Diederik < > > diederik.stru...@ua.ac.be > > > > wrote: > > > > > > > Dear all, > > > > > > > > I am using gamlss (Package gamlss version 4.0-0, R version 2.10.1, > > Windows > > > > XP Service Pack 3 on a HP EliteBook) to relate bird counts to habit > > > > variables. However, most models fail because the global deviance is > > > > increasing and I am not sure what causes this behaviour. The dataset > > > > consists of counts of birds (duck) and 5 habit variables measured in > > the > > > > field (n= 182). The dependent variable (the number of ducks > > > > counted)suffers from zero-inflation and overdisperion: > > > > > > > > > proportion_non_zero <- (sum(ifelse(data$duck == 0,0,1))/182) > > > > > mean <- mean(data$duck) > > > > > var <- var(data$duck) > > > > > proportion_non_zero > > > > [1] 0.1153846 > > > > > mean > > > > [1] 1.906593 > > > > > var > > > > [1] 37.35587 > > > > > > > > (I have no idea how to simulate a zero-inflated overdispersed Poisson > > > > variable, but the data used can be found at > > > > http://www.ua.ac.be/main.aspx?c=diederik.strubbe&n=23519). > > > > > > > > > > > > First, I create a (strong) pattern in the dataset by: > > > > data$LFAP200 <- data$LFAP200 + (data$duck*data$duck) > > > > > > > > I try to analyze these data by fitting several possible distributions > > > > (Poisson PO, zero-inflated Poisson ZIP, negative binomial type I and > > type II > > > > NBI NBII and zero-inflated negative binomial ZINBI) while using cubic > > > > splines with a df=3. The best fitting model will then be choses on the > > basis > > > > of its AIC. > > > > > > > > However, these models frequently fail to converge, and I am not sure > > why, > > > > and what to do about it. For example: > > > > > > > > > model_Poisson <- gamlss(duck ~ cs(HHCDI200,df=3) + cs(HHCDI1000,df=3) > > + > > > > cs(HHHDI200,df=3) + cs(HHHDI1000,df=3) + > > cs(LFAP200,df=3),data=data,family= > > > > PO) > > > > GAMLSS-RS iteration 1: Global Deviance = 1350.623 > > > > GAMLSS-RS iteration 2: Global Deviance = 1350.623 > > > > > > > > > model_ZIPoisson <- gamlss(duck ~ cs(HHCDI200,df=3) + > > cs(HHCDI1000,df=3) + > > > > cs(HHHDI200,df=3) + cs(HHHDI1000,df=3) + > > cs(LFAP200,df=3),data=data,family= > > > > ZIP) > > > > GAMLSS-RS iteration 1: Global Deviance = 326.3819 > > > > GAMLSS-RS iteration 2: Global Deviance = 225.1232 > > > > GAMLSS-RS iteration 3: Global Deviance = 319.9663 > > > > Error in RS() : The global deviance is increasing > > > > Try different steps for the parameters or the model maybe > > inappropriate > > > > In addition: There were 14 warnings (use warnings() to see them) > > > > > > > > > model_NBI <- gamlss(duck ~ cs(HHCDI200,df=3) + cs(HHCDI1000,df=3) + > > > > cs(HHHDI200,df=3) + cs(HHHDI1000,df=3) + > > cs(LFAP200,df=3),data=data,family= > > > > NBI) > > > > GAMLSS-RS iteration 1: Global Deviance = 291.8607 > > > > GAMLSS-RS iteration 2: Global Deviance = 291.3291 > > > > ######......###### > > > > GAMLSS-RS iteration 4: Global Deviance = 291.1135 > > > > GAMLSS-RS iteration 20: Global Deviance = 291.107 > > > > Warning message: > > > > In RS() : Algorithm RS has not yet converged > > > > > > > > > model_NBII <- gamlss(duck ~ cs(HHCDI200,df=3) + cs(HHCDI1000,df=3) + > > > > cs(HHHDI200,df=3) + cs(HHHDI1000,df=3) + > > cs(LFAP200,df=3),data=data,family= > > > > NBII) > > > > GAMLSS-RS iteration 1: Global Deviance = 284.5993 > > > > GAMLSS-RS iteration 2: Global Deviance = 281.9548 > > > > ######......###### > > > > GAMLSS-RS iteration 5: Global Deviance = 280.7311 > > > > GAMLSS-RS iteration 15: Global Deviance = 280.6343 > > > > > > > > > model_ZINBI <- gamlss(duck ~ cs(HHCDI200,df=3) + cs(HHCDI1000,df=3) + > > > > cs(HHHDI200,df=3) + cs(HHHDI1000,df=3) + > > cs(LFAP200,df=3),data=data,family= > > > > ZINBI) > > > > GAMLSS-RS iteration 1: Global Deviance = 1672.234 > > > > GAMLSS-RS iteration 2: Global Deviance = 544.742 > > > > GAMLSS-RS iteration 3: Global Deviance = 598.9939 > > > > Error in RS() : The global deviance is increasing > > > > Try different steps for the parameters or the model maybe > > inappropriate > > > > > > > > > > > > Thus, in this case, only the Poisson (PO) and Negative Binomial type I > > > > (NBI)converge whereas all other models fail > > > > > > > > My first approach was to omit the smoothing factors for each model, or > > > > further reduce the number of variables but this does not solve the > > problem > > > > and most models fail, often yielding a Error in RS() : The global > > deviance > > > > is increasing message. > > > > > > > > I would think that, given the fact that the dependent variable is > > > > zero-inflated and overdispersed, that the Zero-Inflated Negative > > Binomial > > > > (ZINBI) distribution would be the best fit, but the ZINBI even fails in > > the > > > > following very simple examples. > > > > > > > > > model_ZINBI <- gamlss(duck ~ cs(LFAP200,df=3),data=data,family= > > ZINBI) > > > > GAMLSS-RS iteration 1: Global Deviance = 3508.533 > > > > GAMLSS-RS iteration 2: Global Deviance = 1117.121 > > > > GAMLSS-RS iteration 3: Global Deviance = 652.5771 > > > > GAMLSS-RS iteration 4: Global Deviance = 632.8885 > > > > GAMLSS-RS iteration 5: Global Deviance = 645.1169 > > > > Error in RS() : The global deviance is increasing > > > > Try different steps for the parameters or the model maybe > > inappropriate > > > > > > > > > model_ZINBI <- gamlss(duck ~ LFAP200,data=data,family= ZINBI) > > > > GAMLSS-RS iteration 1: Global Deviance = 3831.864 > > > > GAMLSS-RS iteration 2: Global Deviance = 1174.605 > > > > GAMLSS-RS iteration 3: Global Deviance = 562.5428 > > > > GAMLSS-RS iteration 4: Global Deviance = 344.0637 > > > > GAMLSS-RS iteration 5: Global Deviance = 1779.018 > > > > Error in RS() : The global deviance is increasing > > > > Try different steps for the parameters or the model maybe > > inappropriate > > > > > > > > > > > > > > > > Any suggestions on how to proceed with this? > > > > > > > > Many thanks in advance, > > > > > > > > > > > > Diederik > > > > > > > > > > > > Diederik Strubbe > > > > Evolutionary Ecology Group > > > > Department of Biology > > > > University of Antwerp > > > > Groenenborgerlaan 171 > > > > 2020 Antwerpen, Belgium > > > > tel: +32 3 265 3464 > > > > > > > > > > > > [[alternative HTML version deleted]] > > > > > > > > > > > > ______________________________________________ > > > > R-help@r-project.org mailing list > > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > > PLEASE do read the posting guide > > > > http://www.R-project.org/posting-guide.html > > > > and provide commented, minimal, self-contained, reproducible code. > > > > > > > > > > > > > > > > > ______________________________________________ > > > R-help@r-project.org mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > > and provide commented, minimal, self-contained, reproducible code. > > > > -- > > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > > Dr. Gavin Simpson [t] +44 (0)20 7679 0522 > > ECRC, UCL Geography, [f] +44 (0)20 7679 0565 > > Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk > > Gower Street, London [w] > > http://www.ucl.ac.uk/~ucfagls/<http://www.ucl.ac.uk/%7Eucfagls/> > > UK. WC1E 6BT. [w] http://www.freshwaters.org.uk > > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > > > > > > -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.