Dear all,

I probably made a beginner's mistake: while importing an SPSS file I did not 
specify that missing values should become NA (use.missings = TRUE). Thanks to 
Petr Pikal and Bert Gunter I now know how to check how many distinct values a 
variable contains.

I can now fit my logistic model on the original dataset, but unfortunately the 
same problem returns when I fit the model on bootstrap samples of that dataset.

The R-code so far:

library(rms)   # provides lrm()

bootstraps <- 10

# Draw bootstrap samples: resample the rows of dat with replacement
subsets <- lapply(seq_len(bootstraps), function(i) {
    dat[sample(nrow(dat), replace = TRUE), ]
})

# Fit the logistic model on each bootstrap sample
fit.subsets <- lapply(subsets, function(x) {
    lrm(MRI_Diag_RC ~ factor(O4_1r) + N6_1r + leeftijd + LO1 + LO2,
        model = TRUE, x = TRUE, y = TRUE, data = x)
})

Everything is fine until I run the last line, which produces this error in R: 
Error in catg(xi, name = nam, label = lab) : LO2 has <2 category levels
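
One possible explanation, offered as an assumption rather than a diagnosis: a 
bootstrap sample can keep both declared levels of LO2 (which is what str() 
reports) while containing observations from only one of them, and a fit that 
drops unused levels would then see a single category. A minimal sketch on 
made-up data (dat_toy and subsets_toy are hypothetical stand-ins for dat and 
subsets) compares declared and observed levels per sample:

```r
# Toy data standing in for `dat` (assumption): LO2 has a rare second level.
set.seed(1)
dat_toy <- data.frame(
  LO2 = factor(c(rep("nee", 48), rep("ja", 2)), levels = c("nee", "ja"))
)

# Ten bootstrap samples of the rows, drawn with replacement
subsets_toy <- lapply(1:10, function(i) {
  dat_toy[sample(nrow(dat_toy), replace = TRUE), , drop = FALSE]
})

# Declared levels: what str() shows, unchanged by resampling
declared <- sapply(subsets_toy, function(x) nlevels(x$LO2))

# Observed levels: what remains once unused levels are dropped
observed <- sapply(subsets_toy, function(x) nlevels(droplevels(x$LO2)))

# Indices of samples in which LO2 takes fewer than 2 observed values
which(observed < 2)
```

Any index returned by the last line marks a sample in which LO2 is constant 
among its non-missing values. Rows with missing values in other model 
variables may also be dropped before fitting, so it may be worth repeating the 
check on complete cases only.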

I checked how many values of LO2 are known in each bootstrap sample, using:
lapply (subsets, function (x) {str(x$LO2)})

The result:
Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 NA 1 1 1 1 ...
 Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
 Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 NA 1 1 1 1 ...
 Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
 Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
 Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
 Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
 Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
 Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
 Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
NULL

[[6]]
NULL

[[7]]
NULL

[[8]]
NULL

[[9]]
NULL

[[10]]
NULL
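
The NULL entries above are only the return value of str(), which prints its 
summary and returns nothing, so they say nothing about the data. A small 
sketch of a check that may be more informative, using table() with 
useNA = "ifany" on made-up factors (the list toy is a hypothetical stand-in 
for the LO2 column of each bootstrap sample):

```r
# Two toy factors with the same declared levels but different contents
toy <- list(
  factor(c("nee", "nee", NA, "ja"), levels = c("nee", "ja")),
  factor(c("nee", "nee", "nee", NA), levels = c("nee", "ja"))
)

# Count observations per level, with NA as its own category when present
counts <- lapply(toy, table, useNA = "ifany")
counts

# Wrap a printing call in invisible() to suppress the list of NULLs
invisible(lapply(toy, str))
```

A level with a count of zero is declared but not observed in that sample.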

I would be grateful for any ideas, comments or questions about this problem.

Kind regards, Tobias


-----Original Message-----
From: PIKAL Petr [mailto:petr.pi...@precheza.cz] 
Sent: Friday, 7 September 2012 16:22
To: Berg, Tobias van den
CC: r-help
Subject: RE: [R] error: in catg (xi, name=nam, label=lab): "LO2" has <2 
category levels

Hi

It is good to cc the list. Somebody else may have better insight.


> 
> Dear Petr,
> 
> Thank you for responding. What you say seems right. The funny thing,
> however, is that the 'LO2' variable in SPSS has 2 answer categories. If
> I look at the same variable in R, I again see 2 different values.

How do you know? Which command did you use? You should provide at least the 
result of

str(LO2)

as we do not have access to your PC.

> 
> I used your "sapply" code and I guess that it gave me (per variable) the
> number of answer categories/possible values. LO2 scores a 3 in the
> accompanying results. Do you know how I can change that?

Hm. The result depends on what LO2 is. If it is numeric, you have 3 unique 
values. If it is a factor, you can have either 3 levels, or 2 levels plus NA 
values (again, the str result would be helpful, so that we need not guess what 
your data look like). Well, let me guess:

levels(dat$LO2) says you have 3 levels, 2 meaningful ones and one that 
probably comes out as the empty string "".

It should be the first level, so

levels(dat$LO2)[1] <- NA

should drop this unwanted, artificially created level. Or maybe you can get 
rid of such unwanted levels by setting na.string to the empty string during 
import; however, my knowledge of SPSS is close to zero, so I could be 
completely wrong.
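
A small sketch of what that assignment does, on a made-up factor (f is 
hypothetical; it assumes the unwanted level really is the empty string). 
Matching the level by value rather than by position avoids relying on it 
being first:

```r
# Toy factor with an empty-string level, as might result from an import
f <- factor(c("", "nee", "ja", "nee", ""), levels = c("", "nee", "ja"))

# Set the empty-string level to NA: the level is removed and
# the observations that carried it become missing values
levels(f)[levels(f) == ""] <- NA

levels(f)       # remaining levels
sum(is.na(f))   # number of observations now missing
```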

If your variables are factors, you can change the code to

sapply(sapply(dat, levels), length)

and you will get 0 for numeric variables and the number of levels for factor 
variables. A more complete view of your data can also be obtained with 

summary(dat)
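
As an illustration on a made-up data frame (toy is hypothetical), the nested 
call gives 0 for a numeric column and the level count for a factor. One 
caveat: if every column happened to be a factor with the same number of 
levels, the inner sapply would simplify to a matrix and the outer call would 
no longer count levels, so sapply(dat, nlevels) may be the safer form:

```r
# Toy data frame with one numeric column and one factor column
toy <- data.frame(
  age = c(34, 51, 47),
  LO2 = factor(c("nee", "ja", "nee"))
)

# Nested form from the message: length of each column's levels
n_levels_nested <- sapply(sapply(toy, levels), length)

# Direct form: nlevels() returns 0 for anything that has no levels
n_levels_direct <- sapply(toy, nlevels)

n_levels_nested
n_levels_direct
```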

Regards
Petr


> 
> Kind regards, Tobias
> 
> 
> -----Original Message-----
> From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
> Sent: Friday, 7 September 2012 15:02
> To: Berg, Tobias van den; r-help@r-project.org
> Subject: RE: [R] error: in catg (xi, name=nam, label=lab): "LO2" has
> <2 category levels
> 
> Hi
> 
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> > project.org] On Behalf Of Tvandenberg
> > Sent: Friday, September 07, 2012 1:05 PM
> > To: r-help@r-project.org
> > Subject: [R] error: in catg (xi, name=nam, label=lab): "LO2" has <2
> > category levels
> >
> > Dear R-users,
> >
> > During a fit procedure in a logistic prediction model I encounter the
> > following problem:
> >
> > error: in catg (xi, name=nam, label=lab): X has <2 category levels
> 
> I do not know lrm, but the error seems to explain itself: some
> variable has only one level and should have at least 2.
> 
> sapply(sapply(dat, unique), length)
> 
> should give a value of 2 or more for each variable used in the model.
> 
> Regards
> Petr
> 
> 
> >
> > The following code is used:
> >
> > fit <- lrm(MRI_Diag_RC ~ factor(O4_1r) + N6_1r + leeftijd + LO1 + LO2 +
> >     LO3 + LO4 + LO5 + LO6 + LO7 + LO8 + LO9 + LO10 + LO11 + LO12 + LO13 +
> >     LO14 + LO15 + LO16 + LO17 + LO18 + LO19 + LO20 + LO21 + LO22 + LO23 +
> >     LO24 + LO26 + LO27 + LO29,
> >     model = TRUE, x = TRUE, y = TRUE, data = dat)
> >
> > Most predictors are (dichotomous) nominal variables as is the
> > problematic "LO2". Does anyone know what the problem is and how I can
> > correct it?
> >
> > Kind regards,
> >
> > Tobias
> >
> >
> >
> > --
> > View this message in context: http://r.789695.n4.nabble.com/error-in-
> > catg-xi-name-nam-label-lab-LO2-has-2-category-levels-tp4642495.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html and provide commented, minimal, self-contained,
> > reproducible code.

