Solved it! The problem was that the test data contained factor levels that the training data did not. Predicting for these new factor levels with a model that had never been fitted to them triggered the error.
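
A minimal sketch of how such a mismatch might be spotted before calling predict, using base R only. The data frames `train` and `test`, the column `grp` and the helper `unseen_levels` are invented for illustration and do not appear in the original code; they stand in for the fitting subset and the full data.

```r
# Hypothetical toy data: `train` plays the role of the random fitting subset,
# `test` the role of the data to predict for. Level "c" occurs only in `test`.
train <- data.frame(y = c(0, 1, 0, 1),
                    grp = factor(c("a", "b", "a", "b")))
test  <- data.frame(y = c(1, 0, 1),
                    grp = factor(c("a", "b", "c")))

# Hypothetical helper: for every factor column shared by both data frames,
# return the levels that occur in `newdata` but not in `fitdata`.
# Levels that actually occur are compared (via droplevels), because a row
# subset of a data frame keeps the full level set of its factors.
unseen_levels <- function(fitdata, newdata) {
  shared <- intersect(names(fitdata), names(newdata))
  is_fac <- vapply(shared, function(v) is.factor(fitdata[[v]]), logical(1))
  out <- lapply(shared[is_fac], function(v) {
    setdiff(levels(droplevels(factor(newdata[[v]]))),
            levels(droplevels(fitdata[[v]])))
  })
  names(out) <- shared[is_fac]
  out[lengths(out) > 0]
}

bad <- unseen_levels(train, test)
print(bad)

# Keep only the rows of `test` whose factor values were seen during fitting.
keep <- rep(TRUE, nrow(test))
for (v in names(bad)) {
  keep <- keep & !(as.character(test[[v]]) %in% bad[[v]])
}
test_ok <- droplevels(test[keep, , drop = FALSE])
```

In the situation described here, `train` would correspond to `activisale_join[subset.10000, ]` and `test` to the complete cases of `activisale_join`; whether dropping rows or refitting on a subset that covers all levels is preferable depends on the data.
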
After excluding the test rows whose factor levels were not used when fitting the model, or after making sure that all levels are present in the data used for model fitting, everything works fine.

All the best
Julian

-----Original Message-----
From: Simon Wood [mailto:s.w...@bath.ac.uk]
Sent: Tuesday, 9 July 2013 09:07
To: julian.bo...@elitepartner.de
Cc: r-help@r-project.org
Subject: Re: [R] error in "predict.gam" used with "bam"

Hi Julian,

Any chance you could send me (offline) a short version of your data, which
reproduces the problem? I can't reproduce it in a quick attempt (but it is
quite puzzling, given that bam calls predict.gam internally in pretty much
the same way that you are doing here).

btw (and nothing to do with the error) given that you are using R 3.0.1
it's a good idea to upgrade to mgcv_1.7-23 or above, for the following
reason (taken from the mgcv changeLog)

1.7-23
------

*** Fix of severe bug introduced with R 2.15.2 LAPACK change. The shipped
version of dsyevr can fail to produce orthogonal eigenvectors when
uplo='U' (upper triangle of symmetric matrix used), as opposed to 'L'.
This led to a substantial number of gam smoothing parameter estimation
convergence failures, as the key stabilizing re-parameterization was
substantially degraded. The issue did not affect gaussian additive models
with GCV model selection. Other models could fail to converge any further
as soon as any smoothing parameter became `large', as happens when a
smooth is estimated as a straight line. check.gam reported the lack of
full convergence, but the issue could also generate complete fit failures.
Picked up late as full test suite had only been run on R > 2.15.1 with an
external LAPACK.

best,
Simon

On 08/07/13 10:02, julian.bo...@elitepartner.de wrote:
> Hello everyone.
>
> I am doing a logistic gam (package mgcv) on a pretty large dataframe
> (130,000 cases with 100 variables).
>
> Because of that, the gam is fitted on a random subset of 10000.
> Now when I want to predict the values for the rest of the data, I get the
> following error:
>
> > gam.basis_alleakti.1.pr=predict(gam.basis_alleakti.1,
> +   newdata=activisale_join[gam.basis_alleakti.1.complete_cases,all.vars(gam.basis_alleakti.1.formula)],type="response")
> Error in predict.gam(gam.basis_alleakti.1, newdata = activisale_join[gam.basis_alleakti.1.complete_cases, :
>   number of items to replace is not a multiple of replacement length
>
> The following is the code:
>
> # formula with some factors and a lot of variables to be fitted
> gam.basis_alleakti.1.formula=as.formula( paste("verlängerung ~",
>   paste( names(activisale_join)[c(2:10)], collapse="+"), "+", ## factors
>   paste("s(",names(activisale_join)[c(17,19:29,31:42,44)],")", collapse="+")) # numeric variables, all count data
> )
>
> # complete cases
> gam.basis_alleakti.1.complete_cases =
>   complete.cases(activisale_join[,all.vars(gam.basis_alleakti.1.formula)])
>
> # model fitting works on random subset
> gam.basis_alleakti.1=bam(gam.basis_alleakti.1.formula,
>   data = activisale_join[subset.10000, ],
>   family="binomial")
>
> # error, no idea why
> gam.basis_alleakti.1.pr=predict(gam.basis_alleakti.1,
>   newdata=activisale_join[gam.basis_alleakti.1.complete_cases, ],type="response")
>
> The prediction on the same subset (subset.10000) works.
>
> It could be that this error is somewhat similar to the one described as a
> side question in
> http://r.789695.n4.nabble.com/gamm-tensor-product-and-interaction-td4526188.html
> where Simon answered the following:
>
> > Here is the error message I obtain:
> > > vis.gam(gm1$gam,plot.type="contour",n.grid=200,color="heat",zlim=c(0,4))
> > Error in predict.gam(x, newdata = newd, se.fit = TRUE, type = type) :
> >   number of items to replace is not a multiple of replacement length
>
> - hmm, possibly a bug. I'll look into it.
>
> best,
> Simon
>
> All the best
>
> Julian
>
> Ps.:
> > version
>                _
> platform       x86_64-w64-mingw32
> arch           x86_64
> os             mingw32
> system         x86_64, mingw32
> status
> major          3
> minor          0.1
> year           2013
> month          05
> day            16
> svn rev        62743
> language       R
> version.string R version 3.0.1 (2013-05-16)
> nickname       Good Sport
>
> package mgcv version 1.7-22
>
>       [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603               http://people.bath.ac.uk/sw283