Hi - I would like to post to the R-help mailing list.  Here is my post:

This is a general question about how the predict function treats 
categorical variables and how to interpret its output.

I have a zeroinfl model to predict the number of animals encountered:

  b9 <- zeroinfl(Count ~ as.factor(Area) + as.factor(Season) | 1,
                 dist = "negbin", data = total)

where Count is the number of animals and the explanatory variables are Area and 
Season, both coded as factors in the model.  Area has three levels and 
Season has four.  Coding the two as factors lets R create the dummy 
variables for each of them automatically when the model is fitted.
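
To illustrate the dummy coding described above, here is a minimal sketch in 
base R.  The data frame toy is made up for illustration (only the level names 
are taken from this post), and model.matrix() is used to show the columns that 
R builds from factors under its default treatment contrasts.

```r
# Toy data: level names mirror this post, the rows themselves are invented.
toy <- data.frame(
  Area   = factor(c("625", "631", "Bay", "Bay")),
  Season = factor(c("1", "2", "3", "4"))
)

# model.matrix() builds the design matrix used by model-fitting functions:
# an intercept plus one 0/1 column for each non-reference factor level.
X <- model.matrix(~ Area + Season, data = toy)
print(colnames(X))
```

With three Area levels and four Season levels this gives two and three dummy 
columns respectively, since the first level of each factor is absorbed into 
the intercept.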

When I use the predict function to predict the number of animals for a larger 
data set, I want to make sure I'm understanding what is happening.

My newdata for predict is:
newdata <- data.frame(Season, Area)
Both variables are coded as factors and the dataframe is in a long format.  
There are records for each combination of Season and Area that correspond to 
trips taken over the course of 8 years.  There are 113,804 rows of data in the 
newdata data frame.

str(newdata)
'data.frame':   113804 obs. of  2 variables:
$ Season: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
$ Area  : Factor w/ 3 levels "625","631","Bay": 3 3 3 3 3 3 3 3 3 3 ...

Example:
   Season Area
1       1  Bay
2       1  625
3       1  631
4       2  Bay
5       2  625
6       2  631
7       3  Bay
8       3  625
9       3  631
10      4  Bay
11      4  625
12      4  631

1.  Do I need to create dummy variables for all levels of the two variables 
for input into predict, or does the predict function act like zeroinfl, where 
variables coded as factors are expanded into dummy variables automatically by R?
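
One way to check question 1 is sketched below.  It is only a stand-in: it uses 
a Poisson glm() from base R on simulated data rather than zeroinfl, on the 
assumption that both predict methods rebuild the design matrix from the factor 
columns in newdata in the same way.  The objects train, fit, manual and the 
simulated counts are hypothetical.

```r
set.seed(1)

# Simulated training data: 20 made-up trips per Area/Season combination.
train <- expand.grid(Area   = c("625", "631", "Bay"),
                     Season = c("1", "2", "3", "4"),
                     rep    = 1:20)
train$Count <- rpois(nrow(train), lambda = 0.3)

# Stand-in model (Poisson glm, NOT zeroinfl), same right-hand side as b9.
fit <- glm(Count ~ as.factor(Area) + as.factor(Season),
           family = poisson, data = train)

# newdata holds plain factor columns; no hand-made dummy variables.
newdata <- expand.grid(Area   = factor(c("625", "631", "Bay")),
                       Season = factor(c("1", "2", "3", "4")))
pred <- predict(fit, newdata = newdata, type = "response")

# The same predictions computed by hand from an explicit dummy-coded matrix.
X <- model.matrix(~ as.factor(Area) + as.factor(Season), data = newdata)
manual <- as.numeric(exp(X %*% coef(fit)))

print(all.equal(as.numeric(pred), manual))
```

If the two sets of values agree, predict has done the dummy coding itself and 
newdata only needs the factor columns with the same level names as the 
training data.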

2.  predict returns values between 0.0461 and 0.6015.  If I am trying to 
predict the number of animals, how do I interpret this?  Since no predicted 
value is greater than 1 and I need whole numbers, I rounded the predictions so 
that any value less than 0.5 became 0 and any value of 0.5 or greater became 
1.  Does this seem correct?
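
The arithmetic behind question 2 can be sketched as follows, assuming that the 
values returned by predict are expected (mean) counts per trip, which is my 
understanding of the default type = "response" for zeroinfl models but should 
be checked against the pscl documentation.  The vector mu_hat and the size 
value of 1 are hypothetical, and the negative binomial used here ignores zero 
inflation.

```r
# Hypothetical predicted means spanning the range reported above.
mu_hat <- c(0.0461, 0.25, 0.6015)

# Rounding maps every mean to either 0 or 1.
rounded <- round(mu_hat)

# Summing the means estimates the expected total count over these trips;
# summing the rounded values generally gives a different number.
totals <- c(sum_of_means = sum(mu_hat), sum_of_rounded = sum(rounded))
print(totals)

# A count distribution with mean 0.6015 still puts some probability on
# counts above 1 (plain negative binomial, assumed size = 1).
p_more_than_one <- 1 - pnbinom(1, size = 1, mu = 0.6015)
print(p_more_than_one)
```

Under that assumption, a predicted value below 1 does not mean that individual 
counts cannot exceed 1, and rounding each prediction changes the implied total.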

Thanks for any help.

Sally Roman
Fisheries Management Specialist
Virginia Marine Resources Commission
2600 Washington Avenue, 3rd Floor
Newport News, VA  23607
Phone: 757-247-2243



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
