On Jun 18, 2010, at 7:54 PM, David Jarvis wrote:
Hi,
Standard correlations (Pearson's, Spearman's, Kendall's tau) do not
accurately reflect how closely the model (GAM) fits the data. I was told
that the accuracy of the correlation can be improved using a root mean
square deviation (RMSD) calculation on binned data.
By whom? ... and with what theoretical basis?
For example, let 'o' be the real, observed data and 'm' be the model
data. I believe I can calculate the root mean squared deviation as:
sqrt( mean( ( o - m ) ^ 2 ) )
However, this does not bin the data into mean sets. What I would like
to do is:
oangry <- c( mean(o[1:5]), mean(o[6:10]), ... )
mangry <- c( mean(m[1:5]), mean(m[6:10]), ... )
Then:
sqrt( mean( ( oangry - mangry ) ^ 2 ) )
That calculation I would like to simplify into (or similar to):
sqrt( mean( ( bin( o, 5 ) - bin( m, 5 ) ) ^ 2 ) )
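One way the `bin()` call above could be sketched in base R is shown here. The helper names `bin` and `rmsd` are hypothetical (neither is an existing base R function), and the sketch assumes that bins are consecutive blocks of `size` elements, with a shorter final block when the length is not a multiple of `size`. Note that in the usual definition of RMSD the differences are squared before they are averaged.

```r
# Hypothetical helper: mean of each consecutive block of `size` elements.
# The last block may be shorter if length(x) is not a multiple of `size`.
bin <- function(x, size = 5) {
  groups <- ceiling(seq_along(x) / size)  # 1,1,1,1,1,2,2,2,2,2,...
  as.vector(tapply(x, groups, mean))
}

# Hypothetical helper: root mean square deviation of two equal-length vectors.
rmsd <- function(o, m) {
  stopifnot(length(o) == length(m))
  sqrt(mean((o - m)^2))  # square first, then average, then take the root
}

# Made-up illustration data, not from the original question.
o <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
m <- o + 1

bin(o, 5)                    # 3.0  8.0 11.5
rmsd(bin(o, 5), bin(m, 5))   # 1
```

Both vectors have to be binned with the same `size` so that the binned means line up element by element.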
I doubt that your strategy offers any statistical advantage, but if
you want to play around with it then consider:
binned.x <- round( (x + 2.5)/5)
--
David.
I have read the help for ?cut, ?table, ?hist, and ?split, but am
stumped for which one to use in this case--if any.
How do you calculate c( mean(o[1:5]), mean(o[6:10]), ... ) for an
arbitrary length vector using an appropriate number of bins (fixed at 5,
or perhaps calculated using Sturges' formula)?
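For the arbitrary-length case, a minimal sketch in base R might look like the following. The name `bin_means` is hypothetical, and the sketch assumes the bins are contiguous runs of roughly equal length taken in index order. The default bin count comes from `grDevices::nclass.Sturges()`, which returns `ceiling(log2(length(x)) + 1)`.

```r
# Hypothetical helper: split x into `n_bins` contiguous, roughly equal-sized
# runs (by position, not by value) and return the mean of each run.
bin_means <- function(x, n_bins = grDevices::nclass.Sturges(x)) {
  n <- length(x)
  groups <- ceiling(seq_len(n) * n_bins / n)  # bin index for each position
  as.vector(tapply(x, groups, mean))
}

# Made-up illustration data, not from the original question.
x <- 1:12
bin_means(x, 3)       # 2.5  6.5 10.5
length(bin_means(x))  # 5, since ceiling(log2(12) + 1) is 5
```

Because the grouping depends only on the vector length and the bin count, applying `bin_means()` to the observed and model vectors with the same `n_bins` yields binned means that pair up one to one, provided both vectors have the same length.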
I have also posted a more detailed version of this question on
StackOverflow:
http://stackoverflow.com/questions/3073365/root-mean-square-deviation-on-binned-gam-results-using-r
Many thanks.
Dave
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.