Whew, figured it out through trial and error. 

In case anyone else runs into this problem, the issue turned out to be the 
data in one of the columns. I knew I didn't have any actual missing values, but 
one of the columns is a text field that can contain the literal value "NA". I 
guess R was interpreting those strings as missing values (NA) and then running 
into problems later. Once I replaced the literal "NA" with another value, the 
classifier saw the right number of rows, and summary() ran fine.
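
For anyone who wants to see the effect in isolation, here is a minimal sketch. It uses base R's read.csv() on made-up data rather than RODBC, so the toy data frame and its values are assumptions for illustration only. It shows the literal string "NA" being turned into a missing value by the default na.strings setting, and one way to keep it as ordinary text.

```r
# Hypothetical toy data: the text column "Dinosaur" contains the literal
# string "NA" in two rows. None of these values come from the real data set.
csv_text <- "Site,Dinosaur
A,TRex
B,NA
C,Raptor
B,NA"

# Default import: na.strings defaults to "NA", so the literal string is
# read as a missing value.
q_default <- read.csv(text = csv_text, stringsAsFactors = TRUE)
n_missing_default  <- sum(is.na(q_default$Dinosaur))   # values treated as missing
n_complete_default <- sum(complete.cases(q_default))   # rows with no missing values

# With an empty na.strings, no string is treated as missing, so "NA"
# stays an ordinary factor level.
q_literal <- read.csv(text = csv_text, stringsAsFactors = TRUE,
                      na.strings = character(0))
n_missing_literal <- sum(is.na(q_literal$Dinosaur))

print(c(default = n_missing_default, literal = n_missing_literal))
print(levels(q_literal$Dinosaur))
```

Running sum(complete.cases(q)) on the real data frame before fitting is a quick way to check whether rows are being treated as incomplete. Whether the RODBC import converted the strings in the same way is my guess, not something this sketch verifies.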


> From: d.dasc...@hotmail.com
> To: r-help@r-project.org
> Date: Wed, 7 Apr 2010 18:44:34 -0400
> Subject: [R] RWeka - Error when attempting to summary() model
> 
> 
> I'm a big fan of both Weka and R (quite new at R :) ), and jumped at the 
> chance to use them together. Unfortunately, I'm running into what is probably 
> a dumb error when trying to view info about my model. A Google search turned 
> up 0 hits for the actual error I got (last line), but you all are smarter!  
> 
> My code is below. My data frame (q) is imported via RODBC and has 1586 rows 
> (as nrow() shows). q$Site is the column I hope to classify with the JRip 
> classifier. When I print the m object, the model seems to have been trained 
> on far fewer rows than expected (10 vs 1586?), and the summary() command 
> fails with the error shown on the last line, which I haven't seen anyone 
> else report. My guess is that something is wrong with how the training set 
> is specified, but when I add control=Weka_control(F=1) to specify only one 
> fold, the result is the same degenerate confusion matrix error. Is there 
> some other way I should be forcing it to train on more rows? Is that issue 
> related to not being able to generate a confusion matrix?
> 
> 
> 
> > attach(q)
> > nrow(q)
> [1] 1586
> > summary(Site)
>   A   B   C   D   E   F
> 265 190 260 344 329 198
> > m <- JRip(Site~.,data=q)
> > m
> JRIP rules:
> ===========
>
> (Dinosaur = TRex) => Site=A (3.0/0.0)
>  => Site=B (5.0/2.0)
>
> Number of Rules : 2
>
> > summary(m)
> Error in evaluate_Weka_classifier(object, ...) :
>   Cannot set dimnames on degenerate confusion matrix.
> 
> 
> 
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
                                          