I'm trying to understand how the earth package treats linearly dependent regressors. I was surprised that swapping one predictor for a linearly dependent one gave a different fit. Here's an example:
> library(earth)
> cars2 <- transform(cars, speed2=100-speed)
> earth(dist ~ speed, data=cars2)
Selected 3 of 7 terms, and 1 of 1 predictors
Importance: speed
Number of terms at each degree of interaction: 1 2 (additive model)
GCV 255.4974    RSS 10347.64    GRSq 0.622945    RSq 0.6819924
> earth(dist ~ speed2, data=cars2)
Selected 3 of 7 terms, and 1 of 1 predictors
Importance: speed2
Number of terms at each degree of interaction: 1 2 (additive model)
GCV 246.9339    RSS 10000.82    GRSq 0.6355828    RSq 0.692651

Naively, I expected these two fits to be identical, since for each step of the forward pass, the hinge functions considered for speed2 are the same as for speed, just exchanged and with a different constant. Then, for the backward pass, each component should add the same amount to the GCV.

Can anyone shed any light on what's going on here?

Thanks,
Johann
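To make the expectation above concrete, here is a minimal sketch in base R (it does not call earth at all). It fixes one knot by hand and checks that a hinge pair on speed and the mirrored hinge pair on speed2 give the same least-squares fit. The helper `hinge` and the knot value 15 are my own choices for illustration, not anything earth selects or exposes.

```r
# Base R only; 'cars' ships with the datasets package.
cars2 <- transform(cars, speed2 = 100 - speed)

# Hypothetical helper: the hinge function max(0, x), applied elementwise.
hinge <- function(x) pmax(0, x)

knot  <- 15          # arbitrary knot on the speed scale (an assumption)
knot2 <- 100 - knot  # the mirrored knot on the speed2 scale

# Hinge pair on speed.
fit1 <- lm(dist ~ hinge(speed - knot) + hinge(knot - speed), data = cars2)

# Hinge pair on speed2 with the mirrored knot. Since
# speed2 - knot2 = knot - speed, these are the same two columns, swapped.
fit2 <- lm(dist ~ hinge(speed2 - knot2) + hinge(knot2 - speed2), data = cars2)

rss1 <- sum(residuals(fit1)^2)
rss2 <- sum(residuals(fit2)^2)

# Both comparisons should report TRUE if the two bases span the same space.
all.equal(rss1, rss2)
all.equal(unname(fitted(fit1)), unname(fitted(fit2)))
```

So for any fixed knot the two parameterizations agree, which suggests the difference comes from somewhere else, for example from which knots are considered or chosen. This sketch does not test that.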