OK, thank you for your time.

Best regards
Petr Pikal

Duncan Murdoch <murd...@stats.uwo.ca> wrote on 19.08.2009 16:29:07:

> On 8/19/2009 10:14 AM, Petr PIKAL wrote:
> > Duncan Murdoch <murd...@stats.uwo.ca> wrote on 19.08.2009 15:25:00:
> >
> >> On 19/08/2009 9:02 AM, Petr PIKAL wrote:
> >> > Thank you
> >> >
> >> > Duncan Murdoch <murd...@stats.uwo.ca> wrote on 19.08.2009 14:49:52:
> >> >
> >> >> On 19/08/2009 8:31 AM, Petr PIKAL wrote:
> >> >>> Dear all
> >> >>>
> >> >
> >> > <snip>
> >> >
> >> >> I would say the answer depends on the meaning of the variables. In the
> >> >> unusual case that they are measured in dimensionless units, it might
> >> >> make sense not to scale. But if you are using arbitrary units of
> >> >> measurement, do you want your answer to depend on them? For example, if
> >> >> you change from Kg to mg, the numbers will become much larger, the
> >> >> variable will contribute much more variance, and it will become a more
> >> >> important part of the largest principal component. Is that sensible?
> >> >
> >> > Basically the variables are in percentages (all between 0 and 6%) except
> >> > dus, which is present or not present (for the purpose of prcomp
> >> > transformed to 0/1 by as.numeric :). The only variable which is not such
> >> > is iep, which is basically in the range 5-8. So the ranges of all
> >> > variables are quite similar.
> >> >
> >> > What surprises me is that I can interpret the biplot without scaling in
> >> > terms of the variables used, while the biplot with scaling is totally
> >> > different, and the two pictures do not match at all. This is what
> >> > surprised me, as I would have expected just a small difference between
> >> > the results from those two settings, since all numbers are quite
> >> > comparable and do not differ much.
> >>
> >>
> >> If you look at the standard deviations in the two cases, I think you can
> >> see why this happens:
> >>
> >> Scaled:
> >>
> >> Standard deviations:
> >> [1] 1.3335175 1.2311551 1.0583667 0.7258295 0.2429397
> >>
> >> Not Scaled:
> >>
> >> Standard deviations:
> >> [1] 1.0030048 0.8400923 0.5679976 0.3845088 0.1531582
> >>
> >>
> >> The first two sds are close, so small changes to the data will affect
> >
> > I see. But I would expect that the changes to the data made by scaling
> > would not change it in such a way that the unscaled and scaled results
> > are completely different.
> >
> >> their direction a lot. Your biplots look at the 2nd and 3rd components.
> >
> > Yes, because the grouping in the 2nd and 3rd component biplot can be
> > easily explained by the values of some variables (without scaling).
> >
> > I must admit that I do not use prcomp very often, and usually scaling
> > gives me an "explainable" result, especially if I use it for "variable
> > reduction". Therefore I am reluctant to use it in this case.
> >
> > When I try a "more standard" way
> >
> > > fit<-lm(iep~sio2+al2o3+p2o5+as.numeric(dus), data=rglp)
> > > summary(fit)
> >
> > Call:
> > lm(formula = iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)
> >
> > Residuals:
> >      Min       1Q   Median       3Q      Max
> > -0.41751 -0.15568 -0.03613  0.20124  0.43046
> >
> > Coefficients:
> >                 Estimate Std. Error t value Pr(>|t|)
> > (Intercept)      7.12085    0.62257  11.438 8.24e-08 ***
> > sio2            -0.67250    0.20953  -3.210 0.007498 **
> > al2o3            0.40534    0.08641   4.691 0.000522 ***
> > p2o5            -0.76909    0.11103  -6.927 1.59e-05 ***
> > as.numeric(dus) -0.64020    0.18101  -3.537 0.004094 **
> >
> > I get quite a plausible result which can be interpreted without problems.
> >
> > My data are the result of a designed experiment (more or less :) and
> > therefore all variables are significant. Is that the reason why scaling
> > may be inappropriate in this case?
>
> No, I think it's just that the cloud of points is approximately
> spherical in the first 2 or 3 principal components, so the principal
> component directions are somewhat arbitrary. You just got lucky that
> the 2nd and 3rd components are interpretable: I wouldn't put too much
> faith in being able to repeat that if you went out and collected a new
> set of data using the same design.
>
> Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
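
Duncan's point about nearly equal standard deviations can be checked with a small simulation. The sketch below is not from the thread: it assumes independent normal variables whose standard deviations are borrowed from the unscaled output quoted above, a hypothetical sample size of 20, and a made-up "well separated" comparison case. The helper names pc1_angle and pc1_shift are illustrative only.

```r
# Illustrative simulation only; the data and helper names are hypothetical.
set.seed(1)

# Angle in degrees between the first principal axes of two prcomp() fits.
# abs() is used because the sign of a principal axis is arbitrary.
pc1_angle <- function(a, b) {
  cosine <- abs(sum(a$rotation[, 1] * b$rotation[, 1]))
  acos(min(1, cosine)) * 180 / pi
}

# Draw two independent samples of n points with the given column sds,
# fit prcomp() to each, and report how far PC1 moved between them.
pc1_shift <- function(sds, n = 20) {
  draw <- function() sapply(sds, function(s) rnorm(n, sd = s))
  pc1_angle(prcomp(draw()), prcomp(draw()))
}

# First two sds close together (values borrowed from the unscaled output).
close_sds <- c(1.00, 0.84, 0.57, 0.38, 0.15)
# Assumed comparison case: the first sd clearly dominates the rest.
apart_sds <- c(5.00, 0.84, 0.57, 0.38, 0.15)

shift_close <- replicate(200, pc1_shift(close_sds))
shift_apart <- replicate(200, pc1_shift(apart_sds))

# Typical rotation of PC1 between two samples from the same design.
median(shift_close)
median(shift_apart)
```

If the first median comes out much larger than the second, that is the instability described above: when the leading standard deviations are close, a new sample from the same design (or a rescaling of the variables) can rotate the leading components substantially, so their interpretation may not repeat.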