Two possible fixes occur to me:

1) Redo the test/training split, but within levels of the factor, so that you have the same split within each level and every level is accounted for in both training and testing.
2) If you have a lot of levels, and perhaps sparse representation in a few, consider recoding the levels to pool the rare ones into an "other" category.

On Sun, Nov 20, 2022 at 11:41 AM Bert Gunter <bgunter.4...@gmail.com> wrote:

> small reprex:
>
> set.seed(5)
> dat <- data.frame(f = rep(c('r','g'),4), y = runif(8))
> newdat <- data.frame(f = rep(c('r','g','b'),2))
> ## convert values in newdat not seen in dat to NA
> is.na(newdat$f) <- !(newdat$f %in% dat$f)
> lmfit <- lm(y~f, data = dat)
>
> ##Result:
> > predict(lmfit,newdat)
>         1         2         3         4         5         6
> 0.4374251 0.6196527        NA 0.4374251 0.6196527        NA
>
> If this does not suffice, as Rui said, we need details of what you did.
> (predict.glm works like predict.lm)
>
> -- Bert
>
> On Sun, Nov 20, 2022 at 7:46 AM Rui Barradas <ruipbarra...@sapo.pt> wrote:
> >
> > At 15:29 on 20/11/2022, Gábor Malomsoki wrote:
> > > Dear Bert,
> > >
> > > Yes, was trying to fill the not existing categories with NAs, but the
> > > suggested solutions in stackoverflow.com unfortunately did not work.
> > >
> > > Best regards
> > > Gabor
> > >
> > > Bert Gunter <bgunter.4...@gmail.com> wrote on Sun, 20 Nov 2022, 16:20:
> > >
> > >> You can't predict results for categories that you've not seen before
> > >> (think about it). You will need to remove those cases from your test set
> > >> (or convert them to NA and predict them as NA).
> > >>
> > >> -- Bert
> > >>
> > >> On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki <gmalomsoki1...@gmail.com>
> > >> wrote:
> > >>
> > >>> Dear all,
> > >>>
> > >>> i have created a logistic regression model,
> > >>> on the train df:
> > >>> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")
> > >>>
> > >>> then i try to predict with the test df
> > >>> Predict <- predict(mymodel1, newdata = test, type = "response")
> > >>> then i get this error message:
> > >>> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
> > >>> object$xlevels)
> > >>> Factor "TG_KraftF5" has new levels
> > >>>
> > >>> i have tried different proposals from stackoverflow, but unfortunately
> > >>> they did not solve the problem.
> > >>> Do you have any idea how to test a logistic regression model when you have
> > >>> different levels in train and in test df?
> > >>>
> > >>> thank you in advance
> > >>> Regards,
> > >>> Gabor
> > >>>
> > >>> ______________________________________________
> > >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide
> > >>> http://www.R-project.org/posting-guide.html
> > >>> and provide commented, minimal, self-contained, reproducible code.
> >
> > Hello,
> >
> > What exactly didn't work?
> > You say you have tried the solutions found in
> > stackoverflow but without a link, we don't know which answers to which
> > questions you are talking about.
> > Like Bert said, if you assign NA to the new levels, present only in
> > test, it should work.
> >
> > Can you post links to what you have tried?
> >
> > Hope this helps,
> >
> > Rui Barradas
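
For illustration, the two fixes suggested at the top of this message (splitting within levels, and pooling rare levels into "other") might look roughly like the sketch below. It uses base R only. The data frame, the 70/30 split, the cutoff of 10 rows, and the helper name pool_rare are all made up for the example and are not taken from the original poster's data.

```r
set.seed(1)

# Hypothetical data: a binary outcome y and a factor f with two rare levels.
dat <- data.frame(
  f = factor(rep(c("a", "b", "c", "d", "e"), times = c(80, 60, 40, 12, 8))),
  y = rep(0:1, length.out = 200)
)

# Fix 1: sample the training rows separately within each level of f,
# so every level shows up in both the training and the test set.
idx_by_level <- split(seq_len(nrow(dat)), dat$f)
train_idx <- unlist(lapply(idx_by_level, function(i) {
  # roughly 70% of the rows of this level go to training
  i[sample.int(length(i), size = (7 * length(i)) %/% 10)]
}), use.names = FALSE)
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

fit1 <- glm(y ~ f, data = train, family = binomial)
p1   <- predict(fit1, newdata = test, type = "response")

# Fix 2: pool levels that are rare in training into "other".
# pool_rare is a hypothetical helper; anything not in `keep`,
# including values never seen in training, is mapped to "other".
pool_rare <- function(x, keep) {
  x <- as.character(x)
  x[!(x %in% keep)] <- "other"
  factor(x, levels = c(keep, "other"))
}

counts <- table(train$f)
keep   <- names(counts)[as.vector(counts) >= 10]  # arbitrary cutoff

train$f_pooled <- pool_rare(train$f, keep)
test$f_pooled  <- pool_rare(test$f, keep)

fit2 <- glm(y ~ f_pooled, data = train, family = binomial)
p2   <- predict(fit2, newdata = test, type = "response")
```

Note that the second fix only helps if "other" actually occurs in the training data; if no training row falls into it, the model still has nothing to estimate for that category.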