Problems with predict and lines in plotting binomial glm
Dear R-helpers

I have found quite a lot of tips on how to work with glm through this mailing 
list, but still have a problem that I can't solve.
I have a data set in which the x-variable is count data and the y-variable 
is proportion data, and I want to know what the relationship between the 
variables is.
The data were overdispersed (the residual deviance is much larger than the 
residual degrees of freedom), so I am using the quasibinomial family. 
The y-variable is a matrix of successes and failures (20 trials for every 
sample, so each row of y sums to 20).
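
A rough way to check the overdispersion described above is to fit the ordinary binomial model and compare its residual deviance with its residual degrees of freedom. The sketch below uses a small made-up data set, not the data from this post, and the names `toy`, `fit_bin` and `fit_quasi` are hypothetical.

```r
# Toy data, invented for illustration only: 20 trials per sample.
toy <- data.frame(
  x         = c(1200, 1800, 2400, 3000, 3600, 4800, 6600, 9000, 13200, 24000),
  successes = c(16, 6, 15, 8, 16, 7, 13, 4, 9, 3)
)
toy$failures <- 20 - toy$successes

fit_bin   <- glm(cbind(successes, failures) ~ x, family = binomial,
                 data = toy)
fit_quasi <- glm(cbind(successes, failures) ~ x, family = quasibinomial,
                 data = toy)

# A ratio well above 1 suggests overdispersion relative to the binomial model.
deviance(fit_bin) / df.residual(fit_bin)

# Dispersion estimated by the quasibinomial fit
# (Pearson chi-square statistic divided by the residual degrees of freedom).
summary(fit_quasi)$dispersion

# The coefficient estimates of the two fits agree; the quasibinomial fit
# only rescales the standard errors by the estimated dispersion.
all.equal(coef(fit_bin), coef(fit_quasi))
```

The quasibinomial family changes the standard errors and tests, not the fitted curve, so it should not by itself affect what lines() draws.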

x <- c(1200, 1200, 1200, 1200, 1200, 1200, 1200, 1200, 1800, 1800, 1800, 1800, 
1800, 1800, 1800, 1800, 1800, 2400, 2400, 2400, 2400, 2400, 2400, 2400, 3000, 
3000, 3600, 3600, 3600, 3600, 4200, 4200, 4800, 4800, 5400, 6600, 6600, 7200, 
7800, 7800, 8400, 8400, 8400, 9000, 9600, 10200, 13200, 18000, 20400, 24000, 
25200, 36600)
successes <- c(6, 16, 11, 14, 11, 16, 13, 13, 14, 16, 12, 12, 11, 15, 12, 9, 7, 
7, 17, 15, 13, 9, 9, 12, 14, 8, 9, 16, 7, 9, 14, 11, 8, 8, 13, 6, 16, 11, 9, 7, 
9, 8, 4, 14, 7, 3, 3, 9, 12, 8, 4, 6)
failures <- c(14, 4, 9, 6, 9, 4, 7, 7, 6, 4, 8, 8, 9, 5, 8, 11, 13, 13, 3, 5, 
7, 11, 11, 8, 6, 12, 11, 4, 13, 11, 6, 9, 12, 12, 7, 14, 4, 9, 11, 13, 11, 12, 
16, 6, 13, 17, 17, 11, 8, 12, 16, 14)
y <- cbind(successes, failures)
data <- data.frame(y, x)

glm1 <- glm(y ~ x, family= quasibinomial, data= data)
# residual deviance is lower with log-transformed x
glm2 <- glm(y ~ log(x), family= quasibinomial, data= data)
plot(x, successes)
lines(x, predict(glm1, type= "response"), lwd=2)
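
One possible explanation for a missing line, offered as a guess rather than a confirmed diagnosis: predict(..., type = "response") on a binomial-type glm returns probabilities between 0 and 1, while plot(x, successes) puts counts (here up to 20) on the y-axis. A curve on the 0 to 1 scale would then fall at or below the bottom of the plotting region. A minimal sketch on made-up data (`toy`, `fit`, `p_hat` and `n_trials` are hypothetical names) shows two ways of putting points and curve on the same scale:

```r
# Toy data, invented for illustration only: 20 trials per sample,
# x already sorted in increasing order.
n_trials <- 20
toy <- data.frame(
  x         = c(1200, 1800, 2400, 3000, 3600, 4800, 6600, 9000, 13200, 24000),
  successes = c(16, 6, 15, 8, 16, 7, 13, 4, 9, 3)
)
toy$failures <- n_trials - toy$successes

fit <- glm(cbind(successes, failures) ~ x, family = quasibinomial, data = toy)

# Fitted values on the response scale are probabilities, not counts.
p_hat <- predict(fit, type = "response")
range(p_hat)

# Option 1: plot observed proportions and add the fitted probabilities.
plot(toy$x, toy$successes / n_trials, ylim = c(0, 1),
     xlab = "x", ylab = "proportion of successes")
lines(toy$x, p_hat, lwd = 2)

# Option 2: keep the counts and rescale the fitted probabilities instead.
plot(toy$x, toy$successes, ylim = c(0, n_trials),
     xlab = "x", ylab = "successes out of 20")
lines(toy$x, n_trials * p_hat, lwd = 2)
```

Either option works because both arguments to lines() are then in the same units as the plot that was opened.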




Firstly, because of the skewed distribution of the x variable I am not sure 
whether it should be log transformed or not.  When I do log transform it, the 
residual deviance and the p-value for the slope are both lower.
Either way, the lines command does not plot any line, nor does it give 
any error message.  On some of my other data it plots a line way below all the 
data points.  From what I can gather, predict as called above returns 
the fitted values because no newdata argument is specified. I want the line to 
be predicted from the same x-values.  I tried two ways of adding the newdata 
argument:

## a data.frame using the original x-values
lines(x, predict(glm2, type= "response", newdata= as.data.frame(x)))
## or a data.frame with values (the same length as y) spanning the range of x
newdf <- data.frame(seq(min(x), max(x), length=52))
lines(x, predict(glm2, type="response", newdata= newdf))

Only the second option plotted a line once, but then I could never get it to do 
the same again on a new plot even though I used the same variables and same 
code.
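
On the newdata attempts, two details may matter. predict matches the variables in newdata by name, so the column has to be called x; data.frame(seq(...)) gives the column an auto-generated name, in which case predict may fall back to an x found in the workspace. The curve also has to be drawn against the new grid values rather than the original x. A sketch under those assumptions, again on made-up data (`toy`, `fit`, `new_df` and `n_trials` are hypothetical names):

```r
# Toy data, invented for illustration only: 20 trials per sample.
n_trials <- 20
toy <- data.frame(
  x         = c(1200, 1800, 2400, 3000, 3600, 4800, 6600, 9000, 13200, 24000),
  successes = c(16, 6, 15, 8, 16, 7, 13, 4, 9, 3)
)
toy$failures <- n_trials - toy$successes

fit <- glm(cbind(successes, failures) ~ log(x), family = quasibinomial,
           data = toy)

# Evenly spaced, increasing grid over the observed range of x.
# The column is named x so that predict() can match it to the formula.
new_df <- data.frame(x = seq(min(toy$x), max(toy$x), length.out = 200))
new_df$p_hat <- predict(fit, newdata = new_df, type = "response")

# Points and curve on the same (proportion) scale; the curve is drawn
# against the grid values, not against the original x.
plot(toy$x, toy$successes / n_trials, ylim = c(0, 1),
     xlab = "x", ylab = "proportion of successes")
lines(new_df$x, new_df$p_hat, lwd = 2)
```

The grid does not need to have the same length as y; a longer grid simply gives a smoother curve.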


Thank you very much for your time and patience
Anina Heystek

BSc Honours student
Department of Botany and Zoology
University of Stellenbosch, Stellenbosch, South Africa
15418...@sun.ac.za




