Hi all, I am processing a data set with 60 columns and 286,730 rows. Most columns are numerical and some are categorical.
It turns out that when ntree is left at the default value (500), it fails with "cannot allocate a vector of size 1.1 GB", and when I set ntree to a very small number like 10, it still runs for hours. I use the (x, y) interface rather than the (formula, data) interface. My code:

> sdata <- read.csv("D://zSignal Dump//XXXX//XXXX.csv")
> sdata1 <- subset(sdata, select = -38)
> sdata2 <- subset(sdata, select = 38)
> res <- randomForest(x = sdata1, y = sdata2, ntrees = 10)

Am I doing anything wrong? Do you have other suggestions, or are there other packages that do the same thing? I would appreciate it if anyone could help me out.

Thanks and best regards,

Jia Zou, Ph.D.
IBM Research -- China
Diamond Building, #19 Zhongguancun Software Park,
8 Dongbeiwang West Road, Haidian District,
Beijing 100193, P.R. China
Tel: +86 (10) 58748518
E-mail: jia...@cn.ibm.com
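P.S. In case it helps, here is a small self-contained sketch of what I am trying to do, using simulated data of the same shape as mine (response in column 38, everything else numeric, but only 1,000 rows so it runs quickly). I am not certain that the way I extract the response and spell the tree-count argument here matches what my code above actually does, which may be part of my problem:

library(randomForest)

## Simulated stand-in for my data: 60 columns with the response in
## column 38; only 1,000 rows here so the example runs quickly.
set.seed(1)
n <- 1000
sim <- data.frame(matrix(rnorm(n * 60), nrow = n))
sim[[38]] <- factor(sample(c("yes", "no"), n, replace = TRUE))

x <- sim[, -38]    # predictors: all columns except 38
y <- sim[[38]]     # response as a plain factor, not a one-column data frame

## The argument is spelled 'ntree'; I believe a misspelled name is
## silently ignored and the default of 500 trees is used instead.
res <- randomForest(x = x, y = y, ntree = 10)
print(res)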