Great suggestions. I tested the code on an example and the run time was reduced from 1 min 12 sec to 3 sec. Also, I like the suggestion to look at the quantiles. I will see what insight it provides in terms of detecting masked interactions.
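For anyone following along, here is a minimal sketch of the "predict everything in one go, then summarise" idea that produced the speed-up. It uses base R only. The simulated data frame `m`, its coefficients, and the `rep()`-based grid (a stand-in for `expand.grid.df` from the reshape package) are my own assumptions for illustration, not the data or code from the thread.

```r
# Simulated data standing in for the data frame `m` (an assumption).
set.seed(1)
m <- data.frame(x1 = runif(200), x2 = rnorm(200))
m$y <- 1 + 2 * m$x1^2 + 0.5 * m$x2 + rnorm(200, sd = 0.1)

model <- lm(y ~ poly(x1, 2) + x2, data = m)

# 100 evenly spaced values across the observed range of x1.
xs <- with(m, seq(min(x1), max(x1), length.out = 100))

# Cross every grid value of x1 with every observed x2
# (base R stand-in for reshape's expand.grid.df).
newdf <- data.frame(
  x1 = rep(xs, each = nrow(m)),
  x2 = rep(m$x2, times = length(xs))
)

# One predict() call for the whole grid instead of one call per grid value.
predictions <- predict(model, newdata = newdf)

# Summarise the predictions at each grid value of x1.
avg_pred  <- tapply(predictions, newdf$x1, mean)
low_pred  <- tapply(predictions, newdf$x1, quantile, 0.25)
high_pred <- tapply(predictions, newdf$x1, quantile, 0.75)
```

The three summaries can then be drawn together, for example with `matplot(xs, cbind(low_pred, avg_pred, high_pred), type = "l")`. A band between the quartiles that changes width along x1 would be one hint that the effect of x1 depends on the other predictors.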
I have a couple of questions about your code. First, why not use
xs <- seq(min(x1), max(x1), length = 100) instead of
xs <- with(m, seq(min(x1), max(x1), length = 100))?
Second, what is the function geom_line()? I couldn't find it.

Thanks,
Mike

On 4/23/08, hadley wickham <[EMAIL PROTECTED]> wrote:
>
> On Wed, Apr 23, 2008 at 8:33 PM, hadley wickham <[EMAIL PROTECTED]> wrote:
> > > Sure, I am creating a partial dependence plot (reference Friedman's
> > > stochastic gradient paper from, I want to say, 2001). The idea is to find
> > > the relationship between one of the predictors, say x1, and y by creating
> > > the following plot: take a random sample of actual data points, hold other
> > > predictors fixed (x2-xp), vary x1 across its range, create a string of
> >
> > But your code doesn't have a random component - you're trying to
> > calculate every combination of the new x_n and the existing data?
> > Is that right?
>
> And why are you using so many different values of the x variable?
> 100's should be sufficient to get a smooth curve, not thousands. I'd
> also think about displaying not just the mean, but a selection of
> quantiles as well:
>
> Here's one approach:
>
> model <- lm(y ~ poly(x1, 2) + x2, data = m)
>
> xs <- with(m, seq(min(x1), max(x1), length = 100))
>
> library(reshape)
> newdf <- expand.grid.df(data.frame(x1 = xs), m[, c("x2"), drop = FALSE])
>
> predictions <- predict(model, newdata = newdf)
> avg_pred <- tapply(predictions, newdf$x1, mean)
> low_pred <- tapply(predictions, newdf$x1, quantile, 0.25)
> high_pred <- tapply(predictions, newdf$x1, quantile, 0.75)
>
> library(ggplot)
> qplot(xs, avg_pred, min = low_pred, max = high_pred, geom="ribbon") +
>   geom_line()
>
> But following your code, it's exhaustive, not random. This should be
> a little faster because all the predictions are done in one go.
>
> Hadley
>
> --
> http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
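
On the first question in this message, a small sketch of what with() changes. The three-row data frame `m` below is hypothetical and only there to make the example runnable. On the second question, geom_line() is a function exported by the ggplot2 package; that the quoted library(ggplot) call was meant to load ggplot2 is an assumption on my part, not something the thread confirms.

```r
# Hypothetical data frame standing in for `m` from the thread.
m <- data.frame(x1 = c(3, 1, 2), x2 = c(10, 20, 30))

# with() looks up x1 inside m, so this works even though no
# free-standing vector named x1 exists in the workspace.
xs_with <- with(m, seq(min(x1), max(x1), length.out = 5))

# The equivalent without with() has to spell out m$x1 each time.
xs_plain <- seq(min(m$x1), max(m$x1), length.out = 5)

# Dropping both with() and m$ fails unless x1 was also created
# as a separate vector (or m was attached).
bare <- try(seq(min(x1), max(x1), length.out = 5), silent = TRUE)

print(xs_with)
print(identical(xs_with, xs_plain))
print(inherits(bare, "try-error"))
```

So the two forms give the same grid whenever x1 happens to exist as its own vector with the same values as the column in m; with() just removes that requirement and keeps the code tied to the data frame the model was fitted on.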