Here are the results of the logistic regression model. Is it because of the NA values?

Call:
glm(formula = TARGET_A ~ Contract + Dependents + DeviceProtection +
    gender + InternetService + MonthlyCharges + MultipleLines +
    OnlineBackup + OnlineSecurity + PaperlessBilling + Partner +
    PaymentMethod + PhoneService + SeniorCitizen + StreamingMovies +
    StreamingTV + TechSupport + tenure + TotalCharges,
    family = binomial(link = "logit"), data = churn_training)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.8943  -0.6867  -0.2863   0.7378   3.4259

Coefficients: (7 not defined because of singularities)
                                       Estimate Std. Error z value Pr(>|z|)
(Intercept)                           1.0664928  1.7195494   0.620   0.5351
ContractOne year                     -0.6874005  0.1314227  -5.230 1.69e-07 ***
ContractTwo year                     -1.2775385  0.2101193  -6.080 1.20e-09 ***
DependentsYes                        -0.1485301  0.1095348  -1.356   0.1751
DeviceProtectionNo internet service  -1.5547306  0.9661837  -1.609   0.1076
DeviceProtectionYes                   0.0459115  0.2114253   0.217   0.8281
genderMale                           -0.0350970  0.0776896  -0.452   0.6514
InternetServiceFiber optic            1.4800374  0.9545398   1.551   0.1210
InternetServiceNo                            NA         NA      NA       NA
MonthlyCharges                       -0.0324614  0.0379646  -0.855   0.3925
MultipleLinesNo phone service         0.0808745  0.7736359   0.105   0.9167
MultipleLinesYes                      0.3990450  0.2131343   1.872   0.0612 .
OnlineBackupNo internet service              NA         NA      NA       NA
OnlineBackupYes                      -0.0328892  0.2081145  -0.158   0.8744
OnlineSecurityNo internet service            NA         NA      NA       NA
OnlineSecurityYes                    -0.2760602  0.2132917  -1.294   0.1956
PaperlessBillingYes                   0.3509944  0.0890884   3.940 8.15e-05 ***
PartnerYes                            0.0306815  0.0940650   0.326   0.7443
PaymentMethodCredit card (automatic) -0.0710923  0.1377252  -0.516   0.6057
PaymentMethodElectronic check         0.3074078  0.1137939   2.701   0.0069 **
PaymentMethodMailed check            -0.0201076  0.1377539  -0.146   0.8839
PhoneServiceYes                              NA         NA      NA       NA
SeniorCitizen                         0.1856454  0.1023527   1.814   0.0697 .
StreamingMoviesNo internet service           NA         NA      NA       NA
StreamingMoviesYes                    0.5260087  0.3899615   1.349   0.1774
StreamingTVNo internet service               NA         NA      NA       NA
StreamingTVYes                        0.4781321  0.3905777   1.224   0.2209
TechSupportNo internet service               NA         NA      NA       NA
TechSupportYes                       -0.2511197  0.2181612  -1.151   0.2497
tenure                               -0.0702813  0.0077113  -9.114  < 2e-16 ***
TotalCharges                          0.0004276  0.0000874   4.892 9.97e-07 ***

On Thu, Mar 10, 2016 at 4:05 PM, David Winsemius <dwinsem...@comcast.net> wrote:

> > On Mar 10, 2016, at 8:08 AM, Michael Artz <michaelea...@gmail.com> wrote:
> >
> > Hi all,
> > I have the following error -
> >> resultVector <- predict(logitregressmodel, dataset1, type='response')
> > Warning message:
> > In predict.lm(object, newdata, se.fit, scale = 1, type = ifelse(type == :
> >   prediction from a rank-deficient fit may be misleading
>
> It wasn't an R error. It was an R warning. Was the `summary` output on
> logitregressmodel informative? Does the resultVector look sensible given
> its inputs?
>
> > I have seen on the internet that there may be some collinearity in the
> > data and this is causing that. How can I be sure?
>
> Do some diagnostics. Look carefully at the output of
> summary(logitregressmodel), and perhaps summary(dataset1) if it was the
> original input to the modeling functions. Then you could move on to
> looking at cross-correlations on things you think are continuous,
> crosstabs on factor variables, and the condition number of the full data
> matrix.
>
> Lots of stuff turns up on a search for "detecting collinearity condition
> number in r"
>
> > Thanks
> >
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
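
The NA rows in the summary are the coefficients that "not defined because of singularities" refers to: glm() drops a dummy column when it is an exact linear combination of columns that come earlier in the design matrix. Below is a minimal sketch of the diagnostics suggested above (NA coefficients, alias(), a crosstab, and the rank and singular values of the model matrix). It runs on a small made-up data frame, not on churn_training. The names toy, phone, multi, tenure and y are hypothetical. The toy data assumes that the level "No phone service" occurs exactly when the phone factor is "No"; whether the real data has the same structure for the "No internet service" levels is something the crosstab would have to confirm.

```r
# Toy data (hypothetical, NOT the poster's churn_training).
# Assumption built in on purpose: multi == "No phone service" exactly
# when phone == "No", so the two factors carry a redundant dummy column.
set.seed(1)
n      <- 200
phone  <- factor(sample(c("No", "Yes"), n, replace = TRUE, prob = c(0.2, 0.8)))
multi  <- factor(ifelse(phone == "No", "No phone service",
                        sample(c("No", "Yes"), n, replace = TRUE)))
tenure <- round(runif(n, 1, 72))
y      <- rbinom(n, 1, plogis(0.5 - 0.04 * tenure))
toy    <- data.frame(y, phone, multi, tenure)

# multi is listed before phone, so the later, redundant phoneYes column
# is the one that gets dropped.
fit <- glm(y ~ multi + phone + tenure, family = binomial, data = toy)

# 1. Which coefficients were not estimated?
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# 2. How is each aliased column built from the retained columns?
print(alias(fit)$Complete)

# 3. Crosstab of the two factors: structural zeros expose the redundancy.
print(table(phone = toy$phone, multi = toy$multi))

# 4. Rank and singular values of the full design matrix.
X  <- model.matrix(fit)
sv <- svd(X)$d
cat("columns:", ncol(X), " rank:", qr(X)$rank, "\n")
cat("smallest singular value:", min(sv), "\n")  # near zero: huge condition number

# 5. Refit without the redundant factor and compare fitted probabilities.
fit2 <- update(fit, . ~ . - phone)
print(all.equal(fitted(fit), fitted(fit2)))
```

In this sketch the reduced model spans the same column space as the rank-deficient one, so the fitted probabilities agree. That is one way to check whether a resultVector "looks sensible": the warning mainly matters when newdata contains factor combinations that the training data never had.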