There's an alternative, but it may not be any more efficient in time or memory...
You can run predict() on the training set once, setting nodes=TRUE. That
will give you an n by ntree matrix recording which terminal node of each
tree every data point falls in. For any new data, you would run predict()
with nodes=TRUE again, then compute the proximity "by hand" by counting,
for each pair of a new point and a training point, how many trees put
both in the same terminal node.

Andy

> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Kilian
> Sent: Wednesday, February 01, 2012 5:39 AM
> To: r-help@r-project.org
> Subject: [R] randomForest: proximity for new objects using an
> existing rf
>
> Dear all,
>
> using an existing random forest, I would like to calculate the
> proximity for a new test object, i.e. the similarity between the new
> object and the old training objects which were used for building the
> random forest. I do not want to build a new random forest based on
> both old and new objects.
>
> Currently, my workaround is to calculate the proximities of a combined
> data set consisting of training and new objects like this:
>
> model <- randomForest(Xtrain, Ytrain) # build random forest
> nnew <- nrow(Xnew) # number of new objects
> Xcombi <- rbind(Xnew, Xtrain) # combine new objects and training objects
> predcombi <- predict(model, Xcombi, proximity=TRUE) # calculate proximities
> proxcombi <- predcombi$proximity # get proximities of combined dataset
> proxnew <- proxcombi[(1:nnew),-(1:nnew)] # get proximities of new objects only
>
> But this approach causes a lot of wasted computation time as I am not
> interested in the proximities among the training objects themselves
> but only among the training objects and the new objects. With 1000
> training objects and 5 new objects, I have to calculate a 1005x1005
> proximity matrix to get the essential 5x1000 matrix of the new objects
> only.
>
> Am I doing something wrong? I read through the documentation but could
> not find another solution.
> Any advice would be highly appreciated.
>
> Thanks in advance!
> Kilian
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
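
As a rough illustration of the node-counting approach described in the reply above, here is a minimal sketch in base R. The helper name cross_proximity is hypothetical (it is not part of the randomForest package), and it assumes you have already extracted the two terminal-node matrices, one row per object and one column per tree. The commented usage lines assume the node matrix is returned in the "nodes" attribute of the predict() result, and reuse the names model, Xtrain and Xnew from the original post. The proximity is taken here as the fraction of trees in which a new object and a training object share a terminal node.

```r
# Hypothetical helper (not part of randomForest): proximity between new and
# training objects, computed from terminal-node matrices only.
#   nodes_new   : n_new   x ntree matrix of terminal node ids
#   nodes_train : n_train x ntree matrix of terminal node ids
# Returns an n_new x n_train matrix; entry [i, j] is the fraction of trees
# in which new object i and training object j fall in the same terminal node.
cross_proximity <- function(nodes_new, nodes_train) {
  stopifnot(ncol(nodes_new) == ncol(nodes_train))
  train_t <- t(nodes_train)  # ntree x n_train, so each column is one object
  prox <- matrix(0, nrow(nodes_new), nrow(nodes_train))
  for (i in seq_len(nrow(nodes_new))) {
    # The vector nodes_new[i, ] is recycled down each column of train_t,
    # i.e. compared tree by tree against every training object.
    prox[i, ] <- colMeans(train_t == nodes_new[i, ])
  }
  prox
}

# Assumed usage with an existing forest (names taken from the original post):
# nodes_train <- attr(predict(model, Xtrain, nodes = TRUE), "nodes")  # once
# nodes_new   <- attr(predict(model, Xnew,   nodes = TRUE), "nodes")
# proxnew     <- cross_proximity(nodes_new, nodes_train)  # n_new x n_train
```

The training node matrix only needs to be computed once and can be stored alongside the forest. Each batch of new objects then costs one predict() call plus n_new * n_train * ntree comparisons, so whether this beats the combined-data workaround depends on the sizes involved.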