On 15/08/2014 2:40 am, Peter Langfelder wrote: > On Wed, Aug 13, 2014 at 11:57 PM, Peter Brady > <subscripti...@simonplace.net> wrote: >> > Hi All, >> > >> > I've inherited some R code that I can't work out what they've done. It >> > appears to work and give sort of reasonable answers, I'm just trying to >> > work out why they've done what they have. I suspect that this is a >> > simple vector identity that I've just been staring at too long and have >> > forgotten... >> > >> > The code: >> > >> > GGt <- M0 - M1 %*% M0inv %*% t(M1) >> > svdGG <- svd(GGt) >> > Gmat <- svdGG$u %*% diag(sqrt(svdGG$d)) >> > >> > It is supposed to solve: >> > >> > G*G^T = M0 - M1*M0^-1*M1^T >> > >> > for G, where G^T is the transpose of G. It is designed to reproduce a >> > numerical method described in two papers: >> > >> > Srikanthan and Pegram, Journal of Hydrology, 371 (2009) 142-153, >> > Equation A13, who suggest the SVD method but don't describe the >> > specifics, eg: "...G is found by singular value decomposition..." >> > >> > Alternatively, Matalas (1967) Water Resources Research 3 (4) 937-945, >> > Equation 17, say that the above can be solved using Principle Component >> > Analysis (PCA). >> > >> > I use PCA (specifically POD) and SVD to look at the components after >> > decomposition, so I'm a bit lost as to how the original matrix G can be >> > constructed in this case from only the singular values and the left >> > singular vectors. > GG' is a symmetric matrix, so left- and right-singular vectors are the > same. If I recall right, in general it is impossible to find G from > GG' (I denote the transpose by ') since, given an orthogonal > transformation U (that is, UU'=1), GUU'G' = GG', so you can only find > G up to multiplication with an orthogonal transformation matrix. > > Since SVD decomposes a matrix X = UDV', the decomposition for GG' is > > GG' = UDU'; setting S = sqrt(D) (i.e., diagonal matrix with elements > that are sqrt of those in D), GG' = USSU' = USS'U', so one solution is > G = US which is the solution used. > > You could use PCA on G, which is roughly equivalent to doing SVD on > GG' (up to centering and scaling of the columns of G). I am not very > familiar with PCA in R since I always use SVD, but here's what the > help file for prcomp (PCA in R) says: > > The calculation is done by a singular value decomposition of the > (centered and possibly scaled) data matrix, not by using ‘eigen’ > on the covariance matrix. This is generally the preferred method > for numerical accuracy.
YES! Thank you and to Martyn for the same comment. The identity that I was missing was that for a symmetric matrix the left and right singular values are equal, i.e.: U = V and, by extrapolation, U' == V' Then, back to the text books, original papers and google I get the following: 1) G itself is also symmetric as it is a spatial (and perhaps temporal) correlation matrix. 2) PCA: "we basically try to find eigenvalues and eigenvectors of the covariance matrix, C. We showed that C = (AA') / (n-1), and thus finding the eigenvalues and eigenvectors of C is the same as finding the eigenvalues and eigenvectors of AA'." 3) Further, from some algebra of SVD results it can be shown that from A = U S V' we can get: AA' = U S^2 U' Hence, in my notation scheme the PCA decomposition of GG' via SVD is: GG' = U S^2 U' and the corresponding SVD of G is: G = U S V' but as U'=V' I can substitute: G = U S U' Hence, I can compute G from the SVD of GG'. This, unfortunately leads to the conclusion that there is a bug in the code, which can be verified, and fixed, in the sample code: rm(list=ls()) # Make this repeatable for testing set.seed(12003) # Generate a dummy set of 1000 months of rain at 10 measurement stations tsLength <- 1000; nStations <- 10; syntheticRain <- runif(tsLength * nStations, 0, 10) syntheticRain <- matrix(syntheticRain, nrow = tsLength, ncol = nStations) # Normalise the rainfall before further processing as is standard practice in # statistical hydrology. NB: this step should not matter though for the method # as the correlation will eliminate the scale and means. syntheticRain <- scale(syntheticRain) # Compute the spatial correlation of the stations spatialCor <- cor(syntheticRain) # Our dummy GGt: GGt <- spatialCor %*% t(spatialCor) # Now attempt the back calculation to verify svdGGT <- svd(GGt) spatialCorfromSVD1 <- svdGGT$u %*% diag(sqrt(svdGGT$d)) spatialCorfromSVD2 <- svdGGT$u %*% diag(sqrt(svdGGT$d)) %*% t(svdGGT$u) Thanks very much all -pete -- Peter Brady Email: pdbr...@ans.com.au Skype: pbrady77
signature.asc
Description: OpenPGP digital signature
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.