Dear all,

As David suggested, I used his R code to look at the distribution of the 
'LO2' variable. These are the results:

> lapply (subsets, function (x) {table(x$LO2)})
[[1]]
   nee geen atrofie ja atrofie aanweizg 
                173                   0 
[[2]]
   nee geen atrofie ja atrofie aanweizg 
                169                   3 
[[3]]
   nee geen atrofie ja atrofie aanweizg 
                174                   0 
[[4]]
   nee geen atrofie ja atrofie aanweizg 
                172                   2 
[[5]]
   nee geen atrofie ja atrofie aanweizg 
                173                   2 
[[6]]
   nee geen atrofie ja atrofie aanweizg 
                171                   3 
[[7]]
   nee geen atrofie ja atrofie aanweizg 
                167                   5 
[[8]]
   nee geen atrofie ja atrofie aanweizg 
                174                   1 
[[9]]
   nee geen atrofie ja atrofie aanweizg 
                173                   1 
[[10]]
   nee geen atrofie ja atrofie aanweizg 
                175                   0 

I guess that lrm fails because some subsets have no persons in one of the two 
categories: I tried to model each subset separately, and it did not work in 
subsets 1, 3 and 10, exactly the ones with a count of 0. With so few positive 
cases, LO2 seems unlikely to be a usable predictor, let alone a strong one. 
Regardless of this, it seems strange that estimating the prediction models 
breaks down when, over many simulations, there is always a chance that a 
specific variable will by chance alone contain only one category. Does anyone 
have a suggestion for how to deal with that?
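
For reference, a rough sketch of how the affected subsets can be flagged before 
fitting. It uses toy data standing in for my dataset; the level names and the 
objects `degenerate` and `usable` are made up for illustration, and whether 
simply skipping such resamples is statistically acceptable is a separate 
question that this does not answer:

```r
# Toy stand-in for the real data: a rare second category, as with LO2.
set.seed(1)
dat <- data.frame(
  PatID = 1:175,
  LO2   = factor(c(rep("no", 173), rep("yes", 2)), levels = c("no", "yes"))
)

bootstraps <- 10

# One resampled data frame per bootstrap replicate (rows drawn with replacement).
subsets <- lapply(seq_len(bootstraps), function(i) {
  dat[sample(nrow(dat), replace = TRUE), ]
})

# TRUE where at least one declared level of LO2 has zero observations.
degenerate <- sapply(subsets, function(x) any(table(x$LO2) == 0))

which(degenerate)               # indices of resamples with an empty category
usable <- subsets[!degenerate]  # resamples in which both categories occur
```

The same check could be applied to every factor in the model formula, not only 
LO2.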

Kind regards and thanks for all the help so far, 
Tobias


________________________________________
From: David Winsemius [dwinsem...@comcast.net]
Sent: Friday 7 September 2012 18:17
To: Berg, Tobias van den
CC: PIKAL Petr; r-help
Subject: Re: [R] error: in catg (xi, name=nam, label=lab): "LO2" has <2 
category levels

On Sep 7, 2012, at 8:03 AM, Berg, Tobias van den wrote:

> Dear all,
>
> Probably I made a beginner's mistake. While importing an SPSS file I didn't 
> specify that missing values should become NA (use.missings = TRUE). Thanks 
> to Petr Pikal and Bert Gunter I now know how to check how many values are 
> known within a variable.
>
> Although I can fit my logistic model on this dataset, unfortunately, I 
> experience the same problem after bootstrapping the original dataset at hand.
>
> The R-code so far:
>
> bootstraps<-10
>
> subsets<-list()
> for (i in 1:bootstraps){
>    subsets[[i]]<-as.matrix(sample(1:length(dat$PatID), replace=TRUE))
>    }
> subsets <- lapply(subsets, function (x) dat[x, ])
>
> fit.subsets <-lapply (subsets, function (x) {lrm(MRI_Diag_RC ~ factor(O4_1r) 
> + N6_1r + leeftijd + LO1 + LO2, model=T, x=T, y=T, data=x)})
>
> Everything is fine till I run the last line. The following result shows in R: 
> Error in catg(xi, name = nam, label = lab) : LO2 has <2 category levels
>
> I checked how many values of LO2 are known in the simulated datasets, using:
> lapply (subsets, function (x) {str(x$LO2)})

Instead do:

lapply (subsets, function (x) {table(x$LO2)})

You cannot tell what distribution of values you are getting with str(). Just 
because a factor has 2 levels does NOT mean it has two unique values populating 
those levels.
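
A small made-up illustration of that point (the vector `f` and its level names 
are invented here, not taken from your data):

```r
# A factor that declares two levels but only ever takes one of them.
f <- factor(c("no", "no", "no", NA, "no"), levels = c("no", "yes"))

nlevels(f)                 # number of declared levels, which is what str() reports
table(f)                   # counts per level, including levels with zero cases
nlevels(droplevels(f))     # number of levels that actually occur in the data
table(f, useNA = "ifany")  # same counts, with the NA shown as its own column
```

Only the table() output shows that one of the declared levels is empty.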

--
David.

>
> The result:
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 NA 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 NA 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> Factor w/ 2 levels "nee geen atrofie",..: 1 1 1 1 1 1 1 1 1 1 ...
> [[1]]
> NULL
>
> [[2]]
> NULL
>
> [[3]]
> NULL
>
> [[4]]
> NULL
>
> [[5]]
> NULL
>
> [[6]]
> NULL
>
> [[7]]
> NULL
>
> [[8]]
> NULL
>
> [[9]]
> NULL
>
> [[10]]
> NULL
>
> It would be great to receive ideas, comments or questions about my challenge.
>
> Kind regards, Tobias
>
>
> -----Original Message-----
> From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
> Sent: Friday 7 September 2012 16:22
> To: Berg, Tobias van den
> CC: r-help
> Subject: RE: [R] error: in catg (xi, name=nam, label=lab): "LO2" has <2 
> category levels
>
> Hi
>
> It is good to cc the list. Somebody else may have better insight.
>
>
>>
>> Dear Petr,
>>
>> Thank you for responding. It seems right what you say. The funny thing
>> however is that the 'LO2' variable in SPSS has 2 answer categories. If
>> I look at the same variable in R, again I see 2 different values.
>
> How do you know? Which command did you use? You should provide at least the
>
> str(LO2)
>
> output, as we do not have access to your PC.
>
>>
>> I used your "sapply" code and guess that I retrieved (per variable) the
>> number of answer categories/possible values. LO2 scores a 3 in the
>> accompanying results. Do you know how I can change that?
>
> Hm. The result depends on what LO2 is. If it is numeric, you have 3 
> unique values. If it is a factor, you can have either 3 levels, or 2 levels 
> plus NA values (again, the str result would be helpful, so that we need not 
> guess what your data look like). Well, let me guess:
>
> levels(dat$LO2) says you have 3 levels, 2 meaningful ones and one that 
> probably comes out as the empty string "".
>
> It should be the first level, so
>
> levels(dat$LO2)[1] <- NA
>
> should drop this unwanted level. Or maybe you can get rid of such levels by 
> setting na.string to the empty string during import; however, my knowledge 
> of SPSS is close to zero, so I could be completely wrong.
>
> If your values are factors, you can change the code to
>
> sapply(lapply(dat, levels), length)
>
> and you will get 0 for numeric variables and the number of levels for factor 
> variables. A more complete view of your data can also be obtained with
>
> summary(dat)
>
> Regards
> Petr
>
>
>>
>> Kind regards, Tobias
>>
>>
>> -----Original Message-----
>> From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
>> Sent: Friday 7 September 2012 15:02
>> To: Berg, Tobias van den; r-help@r-project.org
>> Subject: RE: [R] error: in catg (xi, name=nam, label=lab): "LO2" has
>> <2 category levels
>>
>> Hi
>>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
>>> project.org] On Behalf Of Tvandenberg
>>> Sent: Friday, September 07, 2012 1:05 PM
>>> To: r-help@r-project.org
>>> Subject: [R] error: in catg (xi, name=nam, label=lab): "LO2" has <2
>>> category levels
>>>
>>> Dear R-users,
>>>
>>> During a fit procedure in a logistic prediction model I encounter the
>>> following problem:
>>>
>>> error: in catg (xi, name=nam, label=lab): X has <2 category levels
>>
>> I do not know lrm, but the error seems self-explanatory: some
>> variable has only one level and should have at least 2.
>>
>> sapply(lapply(dat, unique), length)
>>
>> should give a value of 2 or more for every variable used.
>>
>> Regards
>> Petr
>>
>>
>>>
>>> The following code is used:
>>>
>>> fit <- lrm(MRI_Diag_RC ~ factor(O4_1r) + N6_1r + leeftijd + LO1 + LO2 +
>>>     LO3 + LO4 + LO5 + LO6 + LO7 + LO8 + LO9 + LO10 + LO11 + LO12 + LO13 +
>>>     LO14 + LO15 + LO16 + LO17 + LO18 + LO19 + LO20 + LO21 + LO22 + LO23 +
>>>     LO24 + LO26 + LO27 + LO29,
>>>     model=T, x=T, y=T, data=dat)
>>>
>>> Most predictors are (dichotomous) nominal variables as is the
>>> problematic "LO2". Does anyone know what the problem is and how I can
>>> correct it?
>>>
>>> Kind regards,
>>>
>>> Tobias
>>>
>>>
>>>
>>> --
>>> View this message in context: http://r.789695.n4.nabble.com/error-in-
>>> catg-xi-name-nam-label-lab-LO2-has-2-category-levels-tp4642495.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-
>>> guide.html and provide commented, minimal, self-contained,
>>> reproducible code.
>

David Winsemius, MD
Alameda, CA, USA