Dear all,
I've tried to solve the Es. 12, cap 4 of "Introduction to GLM" by Annette
Dobson.
It's about the relationship between survival time of leukemia patients and
blood cell count.
I tried to fit a model with exponential distribution, first by glm (family
gamma and then dispersion parameter fixed to 1) and
then with survreg.
They gave me the same point estimates but the standard errors are slightly
different.
I checked the results building the routine manually, and it seems that the glm
results are the correct one.
#######################################################
data <- data.frame(y=c(65,156,100,134,16,108,121,4,39,143,56,26,22,1,1,5,65),
x=c(3.36,2.88,3.63,3.41,3.78,4.02,4.00,4.23,3.73,3.85,3.97,4.51,
4.54,5.00,5.00,4.72,5.00))
model1 <- glm(y~x,family=Gamma(link="log"),data)
summary(model1,dispersion=1)
model2 <- survreg(Surv(y) ~ x, data, dist="exponential")
summary(model2)
X <- model.matrix(model1)
y <- as.vector(data$y)
b <- as.matrix(c(1,1)) # STARTING VALUES
iter <- matrix(0,7,2)
iter[1,] <- b
for (i in 2:7) {
W <- diag(rep(1,length(y)))
z <- X%*%b + (y-exp(X%*%b))*1/exp(X%*%b)
b <- solve(t(X)%*%W%*%X) %*% (t(X)%*%W%*%z)
iter[i,] <- b
}
summary(model1,dispersion=1)$coef
summary(model2)
iter[nrow(temp),]
sqrt(diag(solve(t(X)%*%W%*%X)))
#######################################################
Can you explain if this difference is due to an error?
Thanks in advance
Antonio Gasparrini
Public and Environmental Health Research Unit (PEHRU)
London School of Hygiene & Tropical Medicine
Keppel Street, London WC1E 7HT, UK
Office: 0044 (0)20 79272406 - Mobile: 0044 (0)79 64925523
http://www.lshtm.ac.uk/people/gasparrini.antonio (
http://www.lshtm.ac.uk/pehru/ )
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.