Re: [R] fitting in logistic model

Marc Schwartz Tue, 30 Jun 2009 07:59:45 -0700


On Jun 30, 2009, at 4:54 AM, Renzi Fabrizio wrote:

I would like to know how R computes the probability of an event in a
logistic model (P(y=1)) from the score s, linear combination of x and
beta.
I noticed that there are differences (small, less than e-16) betweenthefitting values automatically computed in the glm procedure by R, andthe
values "manually" computed by me applying the reverse formula
p=e^s/(1+e^s); moreover I noticed that the minimum value of thefittingvalues in my estimation is 2.220446e-16, and there are manyobservationwith this probability (instead the minimum value obtained by"manually"
estimation is 2.872636e-152).

It would be helpful to see at least a subset of the output from yourmodel and your manual computations so that we can at least visuallycompare the results to see where the differences may be.

The model object returned from using glm() will contain both thelinear predictors on the link scale (model$linear.predictors) and thefitted values (model$fitted.values). The latter can be accessed usingthe fitted() extractor function.

To use an example, let's create a simple LR model using the infertdata set as referenced in ?glm.

model1 <- glm(case ~ spontaneous + induced, data = infert, family =binomial())


> model1

Call: glm(formula = case ~ spontaneous + induced, family =binomial(), data = infert)


Coefficients:
(Intercept)  spontaneous      induced
    -1.7079       1.1972       0.4181

Degrees of Freedom: 247 Total (i.e. Null);  245 Residual
Null Deviance:      316.2
Residual Deviance: 279.6        AIC: 285.6




# Get the coefficients
> coef(model1)
(Intercept) spontaneous     induced
 -1.7078601   1.1972050   0.4181294



# get the linear predictor values
# log odds scale for binomial glm
> head(model1$linear.predictors)
          1           2           3           4           5           6
 1.10467939 -1.28973068 -0.87160128 -0.87160128 -0.09252564  0.32560375

You can also get the above by using the coefficients and the modelmatrix for comparison:


# the full set of 248
# coef(model1) %*% t(model.matrix(model1))

> head(as.vector(coef(model1) %*% t(model.matrix(model1))))

[1] 1.10467939 -1.28973068 -0.87160128 -0.87160128 -0.092525640.32560375




# get fitted response values (predicted probs)
> head(fitted(model1))
        1         2         3         4         5         6
0.7511359 0.2158984 0.2949212 0.2949212 0.4768851 0.5806893

We can also get the fitted values from the linear predictor values byusing:


LP <- model1$linear.predictors

> head(exp(LP) / (1 + exp(LP)))
        1         2         3         4         5         6
0.7511359 0.2158984 0.2949212 0.2949212 0.4768851 0.5806893

You can also get the above by using the predict.glm() function withtype == "response". The default type of "link" will get you the linearpredictor values as above. predict.glm() would typically be used togenerate predictions using the model on new data.


> head(predict(model1, type = "response"))
        1         2         3         4         5         6
0.7511359 0.2158984 0.2949212 0.2949212 0.4768851 0.5806893

In glm.fit(), which is the workhorse function in glm(), the fittedvalues returned in the model object are actually computed by using theinverse link function for the family passed to glm():


> binomial()$linkinv
function (eta)
.Call("logit_linkinv", eta, PACKAGE = "stats")
<environment: 0x171c8b10>


Thus:

> head(binomial()$linkinv(model1$linear.predictors))
        1         2         3         4         5         6
0.7511359 0.2158984 0.2949212 0.2949212 0.4768851 0.5806893

So those are the same values as we saw above using the other methods.So, all is consistent across the various methods.

Perhaps the above provides some insights for you into how R does someof these things and also to point out, as is frequently the case withR, there is more than one way to get the same result.


HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fitting in logistic model

Reply via email to