On Tue, 13 Aug 2013, Cade, Brian wrote:
Lauria: For historical reasons the logistic regression (binomial with
logit link) model portion of a zero-inflated count model is usually
structured to predict the probability of the 0 counts rather than the
nonzero (>=1) counts so the coefficients will be the negative of what you
expect based on the count model portion (as in your output). It is simple
to interpret the probability of the logistic regression portion as the
probability of the nonzero counts by just taking the negative of the
coefficient estimates provided for the probability of the zero counts.
This is a common misinterpretation but not quite correct.
The zero-inflation model is a mixture model of two components: (1) a count
component (Poisson, NB, ...), and (2) a zero mass component (i.e., zero
with probability 1). Hence, the observed zeros in the data can come from
both sources: either they are "random" zeros from component (1) or
"excess" zeros from component (2).
The binomial zero-inflation part of the model predicts the probability
that a given observation belongs to component (2), i.e., the probability
of an "excess zero". But this is _not_ the probability of observing a zero
in the data, which is larger than the excess zero probability because it
also includes the "random" zeros from component (1).
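To make the distinction concrete, here is a minimal sketch in base R that plugs the coefficient estimates from the zeroinfl() output quoted further down in this thread into the mixture formula P(zero) = p_excess + (1 - p_excess) * P(zero | count component). The discharge values are hypothetical and chosen only for illustration; the variable names are mine, not part of pscl.

```r
# Coefficients copied from the zeroinfl() output quoted in this thread.
count_coef <- c(intercept = 2.557066, discharge = 0.064698)
zero_coef  <- c(intercept = 13.01011, discharge = -1.64293)
theta      <- 0.4604

# Hypothetical discharge values, chosen only for illustration.
discharge <- c(6, 8, 10)

# Component (2): probability of an excess zero (inverse logit).
eta_zero <- zero_coef[["intercept"]] + zero_coef[["discharge"]] * discharge
p_excess <- plogis(eta_zero)

# Component (1): negative binomial mean and its probability of a zero.
mu <- exp(count_coef[["intercept"]] + count_coef[["discharge"]] * discharge)
p_random_zero <- dnbinom(0, size = theta, mu = mu)

# Probability of observing a zero from either source.
p_zero <- p_excess + (1 - p_excess) * p_random_zero

# Negating the zero-inflation coefficients gives 1 - p_excess, i.e. the
# probability of belonging to the count component, not P(count >= 1).
p_count_component <- plogis(-eta_zero)
p_nonzero <- 1 - p_zero

print(data.frame(discharge, p_excess, p_zero, p_count_component, p_nonzero))
```

In each row p_zero exceeds p_excess, and p_nonzero falls short of p_count_component, which is why flipping the sign of the zero-inflation coefficients does not give the probability of a non-zero count. With a fitted object, predict(fit, type = "zero") and predict(fit, type = "prob") in pscl should return the corresponding quantities directly.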
If you want a model that first models zero vs. non-zero and second the
non-zero counts, use the hurdle model. This has exactly the interpretation
you describe above.
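As a rough illustration of the hurdle interpretation, the sketch below uses simulated data (not the plate data from this thread; all names and parameter values are hypothetical). It relies on the fact that, for a hurdle model with a binomial logit zero part, the likelihood separates, so the zero part can be estimated by an ordinary logistic regression on the indicator count > 0.

```r
set.seed(1)

# Simulated data: counts that tend to increase with discharge.
n <- 2000
discharge <- runif(n, 0, 10)
mu <- exp(0.5 + 0.2 * discharge)
plate <- rnbinom(n, size = 0.5, mu = mu)

# Zero part of a hurdle model: models P(plate > 0) directly, so a positive
# coefficient means non-zero counts become more likely as discharge rises.
zero_part <- glm(I(plate > 0) ~ discharge, family = binomial)
print(coef(zero_part))

# Predicted probability of a non-zero count at a few discharge values.
new_data <- data.frame(discharge = c(2, 5, 8))
p_nonzero <- predict(zero_part, newdata = new_data, type = "response")
print(cbind(new_data, p_nonzero))
```

With pscl loaded, hurdle(plate ~ discharge, dist = "negbin") would fit this zero part together with a truncated negative binomial for the positive counts.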
Best,
Z
Brian
Brian S. Cade, PhD
U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO 80526-8818
email: ca...@usgs.gov <brian_c...@usgs.gov>
tel: 970 226-9326
On Tue, Aug 13, 2013 at 9:06 AM, Lauria, Valentina <
valentina.lau...@nuigalway.ie> wrote:
Dear All,
I am running a negative binomial model in R using the package pscl in order
to estimate bed sediment movements versus river discharge. Currently we
have deployed 4 different plates to test if a combination of more than one
plate would better describe the sediment movements when the river discharge
changes over time.
My data are positively skewed and zero-inflated. I ran both
zero-inflated Poisson and zero-inflated negative binomial regressions and
compared them using the Vuong test, which showed that the negative binomial
fits better than the zero-inflated Poisson.
My models look like:
1) plate1 ~ river discharge
2) (plate 1 + plate 2) ~ river discharge
3) (plate 1 + plate 2 + plate 3) ~ river discharge
4) (plate 1 + plate 2 + plate 3 + plate 4) ~ river discharge
My main problem, as I am new to this type of model, is that I get a
different sign for the coefficient of discharge in the two parts of the
zero-inflated negative binomial model output (please see below). What does
this mean? Also, how could I compare the different models (1-4), i.e. what
tells me which is performing best? Thank you very much in advance for any
comments and suggestions!!
Kind Regards,
Valentina
Call:
zeroinfl(formula = plate1 ~ discharge, data = datafit_plates,
    dist = "negbin", EM = TRUE)
Pearson residuals:
    Min      1Q  Median      3Q     Max
-0.6770 -0.3564 -0.2101 -0.0814 12.3421
Count model coefficients (negbin with log link):
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.557066   0.036593   69.88   <2e-16 ***
discharge    0.064698   0.001983   32.63   <2e-16 ***
Log(theta)  -0.775736   0.012451  -62.30   <2e-16 ***
Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 13.01011    0.22602   57.56   <2e-16 ***
discharge   -1.64293    0.03092  -53.14   <2e-16 ***
Theta = 0.4604
Number of iterations in BFGS optimization: 1
Log-likelihood: -6.933e+04 on 5 Df
[[alternative HTML version deleted]]
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.