The example you gave had only one split.  If your real situation has three 
splits, you'll have to look at the testtree$csplit matrix (one row per 
split, one column per level of list_var) and decide how you want to define 
the new grouping variable.  Here's one way to do it ...

Jean

library(rpart)
library(rpart.plot)
test_set <- data.frame(
        list_var=paste("A", (1:1000)%/%25, sep=''), 
        list_val=c(runif(250, 1, 4), runif(250, 3, 5), runif(250, 4, 6), 
                runif(250, 5, 7))
        )

# a preliminary tree, to get the splits (not plotted)
testtree <- rpart(list_val ~ list_var, minbucket=100, data=test_set) 

# a vector of the unique values of list_var, sorted
suvar <- sort(unique(test_set$list_var))

# define a new variable to represent all combinations of splits in testtree
# (each column of csplit is one level of list_var; paste its split codes)
splitz <- apply(testtree$csplit, 2, paste, collapse="-")
groups <- factor(splitz, labels=seq_along(unique(splitz)))
# expand this new variable to the length of the original data frame
test_set$var_grp <- as.factor(groups[match(test_set$list_var, suvar)])

# fit another tree, using the grouping variable, for plotting purposes
testtree2 <- rpart(list_val ~ var_grp, data=test_set) 
rpart.plot(testtree2, type=3) 
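
To show how the grouping step works, here is a minimal sketch on a made-up 
matrix standing in for testtree$csplit.  The names toy_csplit, toy_levels, 
sig, toy_groups and obs are hypothetical and the values are invented for 
illustration; only the layout (one row per split, one column per level) and 
the coding (1 = level goes left, 3 = level goes right, 2 = level not present 
at that node) follow what rpart stores.

```r
# Toy stand-in for testtree$csplit (invented values):
# 3 categorical splits (rows) over 6 levels of the factor (columns).
# 1 = goes left, 3 = goes right, 2 = level not present at that node.
toy_csplit <- matrix(c(1, 1, 1, 3, 3, 3,
                       1, 1, 3, 2, 2, 2,
                       2, 2, 2, 1, 3, 3),
                     nrow = 3, byrow = TRUE)
toy_levels <- c("A0", "A1", "A2", "A3", "A4", "A5")

# Paste each column into one string.  Levels with the same string are
# sent the same way by every split, so they can share a group.
sig <- apply(toy_csplit, 2, paste, collapse = "-")

# Number the distinct strings 1, 2, ... to get short group labels.
toy_groups <- factor(sig, labels = seq_along(unique(sig)))

# Look-up table from original level to group
print(data.frame(level = toy_levels, sig = sig, group = toy_groups))

# Expand to individual observations, as match() does in the code above
obs <- c("A3", "A0", "A5", "A2")
obs_grp <- toy_groups[match(obs, toy_levels)]
```

Here A0 and A1 share a group, as do A4 and A5, because their columns are 
identical.  If you would rather see "Group 1", "Group 2", etc. in the plot, 
labels = paste("Group", seq_along(unique(sig))) should work the same way.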


Mark Beauchene <markbeauch...@hotmail.com> wrote on 07/11/2012 02:34:52 PM:

> Thank you, it works very well.
> 
> Could you help me out by explaining a little bit of how it works? 
> In my actual plot I have 3 splits on the same long list class 
> variable, and I don't completely follow your code.
> 
> Mark Beauchene
> 
> To: markbeauch...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Plotting rpart trees with long list of class members
> From: jvad...@usgs.gov
> Date: Tue, 10 Jul 2012 09:10:05 -0500
> 
> Thanks.  Very helpful. 
> 
> You can use the information from the splits in the first tree, to 
> define a new grouping variable, which will simplify the plot: 
> suvar <- sort(unique(test_set$list_var)) 
> test_set$var_grp <- as.factor(testtree$csplit[match(test_set$list_var, suvar)]) 
> testtree2 <- rpart ( list_val ~ var_grp, data = test_set ) 
> rpart.plot(testtree2, type=3) 
> 
> Note to other readers: you will need to load these packages before 
> running the code: 
> library(rpart) 
> library(rpart.plot) 
> 
> Jean 
> 
> 
> MarkBeauchene <markbeauch...@hotmail.com> wrote on 07/09/2012 03:42:32 PM:
> > Here is some sample code.  It generates a class (list_var) that is used in
> > rpart.  list_val is the dependent variable.
> > 
> > The plot shows all the values of the class, which is a mess and makes the
> > plot unusable.  I'd like to either suppress the list entirely or replace it
> > with something like "Group 1", "Group 2", etc.
> > 
> > list_var <- rep(NA,1000)
> > list_val <- rep(NA,1000)
> > for (i in 1:1000) {
> > list_var[i] <- paste("A",i%/%25,sep='')
> > list_val[i] <- runif(1,0,1) }
> > test_set <- data.frame(list_var, list_val )
> > 
> > 
> > 
> > 
> > testtree <- rpart ( list_val ~ list_var, data = test_set )
> > rpart.plot(testtree, type=3)
