I know that this is a quite old post but I am dealing with a similar warning message and, also after reading all the posts here, I am still in doubt with what I should do with my analysis.
I have a dataset where the binary response variable is sex, and the predictors are several variables (they are percentage). I know that one of this variables, "hcp", predicts quite well the sex of the individuals given that the ca. 85% of females have hcp>0 while ca. 85% of females have hcp=0. I don't believe that it does matter, but all the values are very low (the maximum value of a predictor doesn't exceed 0.5) and most of predictors show a very left-skewed distribution (log-normal). In any case, I believe that it could be said that quasi-separation is occurring. When I run the saturated model I get the warning message: ********************* >glm.sat1<-glm(sex~hwp+hcp+hnp+twp+d1lp*d2dp+hwp:hcp+hwp:hnp+hwp:twp+hcp:hnp+hcp:twp+hnp:twp,data=dati,family="binomial") Mensajes de aviso perdidos glm.fit: fitted probabilities numerically 0 or 1 occurred > summary(glm.sat1) Call: glm(formula = sex ~ hwp + hcp + hnp + twp + d1lp * d2dp + hwp:hcp + hwp:hnp + hwp:twp + hcp:hnp + hcp:twp + hnp:twp, family = "binomial", data = dati) Deviance Residuals: Min 1Q Median 3Q Max -2.6838 -0.1439 0.2699 0.5694 3.8421 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 8.940 4.046 2.210 0.027125 * hwp -4.897 4.550 -1.076 0.281793 hcp -414.796 113.780 -3.646 0.000267 *** hnp -3.689 6.692 -0.551 0.581487 twp 5.450 2.566 2.124 0.033657 * d1lp -22.436 13.837 -1.621 0.104926 d2dp -21.076 9.266 -2.274 0.022940 * d1lp:d2dp 65.102 35.394 1.839 0.065864 . hwp:hcp 313.514 823.782 0.381 0.703516 hwp:hnp -46.572 79.467 -0.586 0.557834 hwp:twp 8.252 21.503 0.384 0.701156 hcp:hnp -1721.320 1925.766 -0.894 0.371409 hcp:twp -94.153 459.710 -0.205 0.837721 hnp:twp 16.598 38.548 0.431 0.666769 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 581.51 on 425 degrees of freedom Residual deviance: 281.98 on 412 degrees of freedom (41 observations deleted due to missingness) AIC: 309.98 Number of Fisher Scoring iterations: 8 ********************** The estimates of hcp are very high and its Std.Error are also quite high. If I go further with the analyses and use the "step" procedure I finally get a candidate minimal adequate model: > step.1<-step(glm.sat1) Start: AIC=309.98 sex ~ hwp + hcp + hnp + twp + d1lp * d2dp + hwp:hcp + hwp:hnp + hwp:twp + hcp:hnp + hcp:twp + hnp:twp Df Deviance AIC - hcp:twp 1 282.02 308.02 - hwp:hcp 1 282.11 308.11 - hwp:twp 1 282.13 308.13 - hnp:twp 1 282.17 308.17 - hwp:hnp 1 282.32 308.32 - hcp:hnp 1 282.90 308.90 <none> 281.98 309.98 - d1lp:d2dp 1 285.43 311.43 Step: AIC=308.02 sex ~ hwp + hcp + hnp + twp + d1lp + d2dp + d1lp:d2dp + hwp:hcp + hwp:hnp + hwp:twp + hcp:hnp + hnp:twp Df Deviance AIC - hwp:hcp 1 282.13 306.13 - hwp:twp 1 282.14 306.14 - hnp:twp 1 282.18 306.18 - hwp:hnp 1 282.43 306.43 - hcp:hnp 1 283.07 307.07 <none> 282.02 308.02 - d1lp:d2dp 1 285.43 309.43 Step: AIC=306.13 sex ~ hwp + hcp + hnp + twp + d1lp + d2dp + d1lp:d2dp + hwp:hnp + hwp:twp + hcp:hnp + hnp:twp Df Deviance AIC - hwp:twp 1 282.24 304.24 - hnp:twp 1 282.27 304.27 - hwp:hnp 1 282.53 304.53 - hcp:hnp 1 283.25 305.25 <none> 282.13 306.13 - d1lp:d2dp 1 285.46 307.46 Step: AIC=304.24 sex ~ hwp + hcp + hnp + twp + d1lp + d2dp + d1lp:d2dp + hwp:hnp + hcp:hnp + hnp:twp Df Deviance AIC - hnp:twp 1 282.35 302.35 - hwp:hnp 1 282.83 302.83 - hcp:hnp 1 283.36 303.36 <none> 282.24 304.24 - d1lp:d2dp 1 285.56 305.56 Step: AIC=302.35 sex ~ hwp + hcp + hnp + twp + d1lp + d2dp + d1lp:d2dp + hwp:hnp + hcp:hnp Df Deviance AIC - hwp:hnp 1 283.06 301.06 - hcp:hnp 1 283.37 301.37 <none> 282.35 302.35 - d1lp:d2dp 1 285.71 303.71 - twp 1 306.17 324.17 Step: AIC=301.06 sex ~ hwp + hcp + hnp + twp + d1lp + d2dp + d1lp:d2dp + hcp:hnp Df Deviance AIC - hcp:hnp 1 284.36 300.36 <none> 283.06 301.06 - d1lp:d2dp 1 286.26 302.26 - hwp 1 286.96 302.96 - twp 1 307.98 323.98 Step: AIC=300.36 sex ~ hwp + hcp + hnp + twp + d1lp + d2dp + d1lp:d2dp Df Deviance AIC <none> 284.36 300.36 - hnp 1 287.07 301.07 - d1lp:d2dp 1 287.80 301.80 - hwp 1 288.08 302.08 - twp 1 308.39 322.39 - hcp 1 484.12 498.12 Hubo 40 avisos (use warnings() para verlos) > summary(step.1) Call: glm(formula = sex ~ hwp + hcp + hnp + twp + d1lp + d2dp + d1lp:d2dp, family = "binomial", data = dati) Deviance Residuals: Min 1Q Median 3Q Max -2.7062 -0.1652 0.2669 0.5909 3.5082 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 8.442 3.925 2.151 0.0315 * hwp -4.692 2.413 -1.945 0.0518 . hcp -464.909 59.109 -7.865 3.68e-15 *** hnp -5.673 3.317 -1.710 0.0872 . twp 6.193 1.428 4.335 1.46e-05 *** d1lp -21.332 13.487 -1.582 0.1137 d2dp -19.983 9.042 -2.210 0.0271 * d1lp:d2dp 63.218 34.530 1.831 0.0671 . --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 581.51 on 425 degrees of freedom Residual deviance: 284.36 on 418 degrees of freedom (41 observations deleted due to missingness) AIC: 300.36 Number of Fisher Scoring iterations: 8 > library(profileModel) > pp<-profileModel(step.1,quantile=qchisq(0.95,1),objective="ordinaryDeviance") Preliminary iteration ........ Done Profiling for parameter (Intercept) ... Done Profiling for parameter hwp ... Done Profiling for parameter hcp ... Done Profiling for parameter hnp ... Done Profiling for parameter twp ... Done Profiling for parameter d1lp ... Done Profiling for parameter d2dp ... Done Profiling for parameter d1lp:d2dp ... Done ****************************** Plotting the profileModel I get this: > dev.new() > plot(pp) http://r.789695.n4.nabble.com/file/n3850331/Rplot.jpg The hcp plot is far by being quadratic as I understood it should be. I have tried to use the brglm procedure (on saturated model and glm derived minimal adequate model) and I got almost the very same results. Any suggestion/commentary on this? I would appreciate any help, Simone -- View this message in context: http://r.789695.n4.nabble.com/OT-quasi-separation-in-a-logistic-GLM-tp875726p3850331.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.