I've got a data set of about 5,000 observations with a zero-inflated count 
response variable, two predictor variables and a variable which is effectively 
an area of opportunity, which I want to use as an offset term (all continuous). 
 I want to explore the association between these variables, in particular I'm 
interested in the association of one particular variable net of the others and 
it would be useful when presenting the results of the analysis to hold my other 
variables constant at say their means or medians, and show what my model 
predicts would happen to my RV as it varied.

In reality there are a bunch of methods issues with this that I don't want to 
explore in this post, (like would a hurdle model be better, and as it is a 
model with spatial variables, should I use something optimised for this, 
bootstrapping the CIs etc) as I want to get to an adequate level of 
understanding of how this particular model behaves before I go any further with 
this.

I have incorporated the code to synthesise some data that look a bit like mine, 
I've then gone about making some predictions the best way I can given my 
current level of understanding of the package I've used (pscl), with some brief 
annotation.

#packages required
require(pscl)
require(emdbook)
#get some data
set.seed(33)
ds<data.frame(a=rzinbinom(500,mu=2,zprob=0.6,size=0.8),b=runif(500,3,70),c=runif(500,0.3,10),d=runif(500,0.03,0.8))
#run the model
mymodel<-zeroinfl(a~b+c,offset=d,data=ds,dist="negbin",EM=TRUE)
summary(mymodel)
#make a prediction data set, holding c and d at their means and varying b over 
the range of original values
mypred<-data.frame(b=3:70,c=5.18,d=0.39)
#generate predicted values of response variable, add to the data frame
pred_rv<-predict.zeroinfl(mymodel,newdata=mypred,type="response",model="count",se=TRUE)
#assemble  the prediction data.frame
predfinal<-data.frame(mypred,pred_rv)
colnames(predfinal)<-c("b","c","d","predicted","lower","upper","se")
predfinal

My questions are:  What exactly does this give me? I am trying to get counts in 
their original scale that my model predicts.  Is what I have here the modelled 
counts from just the count component of the model? If I change the model term 
to "zero" from "count" I get the same result, why is this?  Or indeed have i 
simply completely gone the wrong way about getting these estimates?

Also, I can't seem to manually replicate any of these predicted values using 
the model results.

I've tried looking at the help page of the predict.zeroinfl and predict 
commands, but still cannot get much further with this.

Any help is much appreciated.

Regards

GavinR


        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to