Re: [R] Recursive partitioning algorithms in R vs. alia

Tobias Verbeke Fri, 19 Jun 2009 23:07:01 -0700

Wensui Liu wrote:

well, how difficult is it to code random forest with sas macro + proc split?
if you lack sas programming skill, then you are correct that
you have to wait for 8 years :-)

It is true one can use the macro language to obtain some control flow
that the plain SAS language and its PROCs are missing, and for
manipulating matrices there is even a third language (IML), but my
customers prefer to leverage community-tested open source
implementations as building blocks rather than spending unnecessary
resources in writing things from scratch in their corner.
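To illustrate the "building blocks" point, here is a minimal sketch of
bagging built on top of rpart. Note that this is plain bagging, not a
random forest (there is no random selection of predictors at each split),
and the choices of the iris data, 25 trees and the seed are assumptions
made purely for illustration.

```r
# rpart ships with R as a recommended package
library(rpart)

set.seed(1)  # arbitrary seed, only to make the bootstrap samples reproducible

# Fit one classification tree per bootstrap sample of the rows
n_trees <- 25
trees <- lapply(seq_len(n_trees), function(i) {
  idx <- sample(nrow(iris), replace = TRUE)
  rpart(Species ~ ., data = iris[idx, ], method = "class")
})

# Each tree votes for a class; one column of votes per tree
votes <- sapply(trees, function(tr) {
  as.character(predict(tr, newdata = iris, type = "class"))
})

# Majority vote across the trees for each observation
bagged <- apply(votes, 1, function(v) names(which.max(table(v))))

# Agreement with the observed classes (training data, so optimistic)
mean(bagged == iris$Species)
```

The resampling and voting take a handful of lines because the tree
fitting itself is delegated to an existing, tested implementation.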

i don't know how much sas experience you have. as far as i know, both
bagging and boosting have been implemented in sas em for a while,
together with other cutting-edge modeling tools such as svm / nnet.

Fair enough, but whenever you need ensemble methods for survival
data or would like to escape bias in variable importance in the presence
of categorical predictors, you will (1) not be able to take something off
the shelf and (2) not be able to programmatically tweak SAS EM procedures
(as they are not exposed but locked in the GUI), so there again your
only option is to implement things from scratch.

Best,
Tobias

On Fri, Jun 19, 2009 at 4:18 PM, Tobias
Verbeke<tobias.verb...@openanalytics.be> wrote:

Wensui Liu wrote:

in terms of the richness of features and ability to handle large
data (which is normal in banks), SAS EM should be on top of others.

Should be ? That is not at all my experience.
SAS EM is very much lagging behind current
research. You will find variants of random forests
in R that will not be in SAS for the next 8 years,
to give just one example.

however, it is not cheap.
in terms of algorithm, split procedure in sas em can do
chaid/cart/c4.5, if i remember correctly.

These are techniques of the 80s and 90s
(which proves my point). CART is in rpart and
an implementation of C4.5 can be accessed
through RWeka. For the oldest one (CHAID, 1980),
there might be an implementation soon:

http://r-forge.r-project.org/projects/chaid/

but again there have been quite some improvements
in the last decade as well:

http://cran.r-project.org/web/views/MachineLearning.html
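As a minimal sketch of the CART route through rpart mentioned above,
something along these lines should work. The iris data and the default
control settings are assumptions chosen only for illustration; they say
nothing about how the bank's data would behave.

```r
# rpart ships with R as a recommended package
library(rpart)

# Fit a CART-style classification tree on the built-in iris data
fit <- rpart(Species ~ ., data = iris, method = "class")

# Show the fitted splits and the cost-complexity table used for pruning
print(fit)
printcp(fit)

# Predicted classes on the training data, as a quick sanity check
pred <- predict(fit, newdata = iris, type = "class")
table(observed = iris$Species, predicted = pred)
```

For a regression tree, method = "anova" takes the place of
method = "class".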

HTH,
Tobias

On Fri, Jun 19, 2009 at 2:35 PM, Carlos J. Gil
Bellosta<c...@datanalytics.com> wrote:

Dear R-helpers,

I had a conversation with a guy working in a "business intelligence"
department at a major Spanish bank. They rely on recursive partitioning
methods to rank customers according to certain criteria.

They use both SAS EM and Salford Systems' CART. I have used package
rpart in the past, but I could not provide any kind of feature comparison
or the like as I have no access to any installation of the first two
proprietary products.

Has anybody experience with them? Is there any public benchmark
available? Is there any very good --although solely technical-- reason
to pay hefty software licences? How would the algorithms implemented in
rpart compare to those in SAS and/or CART?

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


