On Sat, 16 Nov 2013, Preetam Pal wrote:
Hi,
I have a data set on credit ratings for customers in a bank (Rating = 1
for defaulters, 0 for non-defaulters). I have 10 predictor variables
(C1, C2, ..., C10). I want to build a CHAID tree in R for
classification. How do I do this? For your perusal, the data set is
attached. Thanks in advance.
The classical CHAID algorithm is implemented in a package on R-Forge:
https://R-Forge.R-project.org/R/?group_id=343
However, this only supports categorical covariates and hence is not useful
for your data.
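If you did want to use CHAID anyway, one workaround is to discretize the numeric covariates into factors first, e.g. with cut() at the quartiles. A minimal sketch on a toy vector (the variable name and breakpoints are illustrative, not from the attached data):

```r
## Toy numeric predictor, binned into quartile-based categories
## so a CHAID-style algorithm could handle it as a factor.
x <- c(-250, -180, -120, -90, -40, 10, 75, 130)
xf <- cut(x,
          breaks = quantile(x, probs = seq(0, 1, 0.25)),
          include.lowest = TRUE)
is.factor(xf)   ## the binned covariate is now categorical
nlevels(xf)     ## four quartile bins
```

Of course, binning discards information, which is one reason the unbiased-recursive-partitioning methods below are usually preferable for mixed numeric data.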
Alternatively, you might want to try out other packages for learning
classification trees, e.g., partykit or rpart. See also
http://CRAN.R-project.org/view=MachineLearning
For your data you could do:
## read data with factor response
d <- read.table("text.txt", header = TRUE)
d$Rating <- factor(d$Rating)
## ctree
library("partykit")
ct <- ctree(Rating ~ ., data = d)
plot(ct)
## rpart
library("rpart")
rp <- rpart(Rating ~ ., data = d, control = list(cp = 0.02))
plot(as.party(rp))
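Rather than fixing cp = 0.02, you can also let rpart's built-in cross-validation guide the choice and prune afterwards. A sketch using rpart's bundled kyphosis data (since your attachment is not reproduced here):

```r
## Grow a tree on rpart's built-in kyphosis data, inspect the
## cross-validated error table, and prune at the cp minimizing xerror.
library("rpart")
rp <- rpart(Kyphosis ~ ., data = kyphosis)
printcp(rp)  ## CP table with cross-validated error (xerror) per split
best <- rp$cptable[which.min(rp$cptable[, "xerror"]), "CP"]
rp2 <- prune(rp, cp = best)
```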
## evtree
library("evtree")
set.seed(1)
ev <- evtree(Rating ~ ., data = d, maxdepth = 5)
plot(ev)
All methods agree that the decisive split is in C2 at about -110. You
might be able to infer some further splits within the C2 < -110
subsample, but there the methods disagree somewhat.
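To compare the fitted trees beyond eyeballing the plots, you can tabulate in-sample predictions against the observed response. Since the attachment is not available here, the sketch below simulates a toy data set mimicking the C2-around--110 pattern (variable names and probabilities are illustrative):

```r
## Simulated stand-in for the attached data: Rating is mostly 1
## below a threshold in C2 near -110, mostly 0 above it.
set.seed(1)
n <- 200
d <- data.frame(C1 = rnorm(n), C2 = runif(n, -200, 0))
d$Rating <- factor(ifelse(d$C2 < -110,
                          rbinom(n, 1, 0.8),
                          rbinom(n, 1, 0.1)))

## Fit a conditional inference tree and cross-tabulate
## in-sample predictions vs. observations.
library("partykit")
ct <- ctree(Rating ~ ., data = d)
table(predicted = predict(ct), observed = d$Rating)
```

The same table() comparison works for the rpart and evtree fits via their respective predict() methods, which makes the disagreement in the deeper splits easy to quantify.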
Best,
Z
-Preetam
--
Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division, C.V.Raman
Hall
Indian Statistical Institute, B.H.O.S.
Kolkata.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.