[R] How Can SVD Reconstruct a Matrix
Peter Brady
subscriptions at simonplace.net
Fri Aug 15 02:52:08 CEST 2014
On 15/08/2014 2:40 am, Peter Langfelder wrote:
> On Wed, Aug 13, 2014 at 11:57 PM, Peter Brady
> <subscriptions at simonplace.net> wrote:
>> > Hi All,
>> >
>> > I've inherited some R code that I can't work out what they've done. It
>> > appears to work and give sort of reasonable answers, I'm just trying to
>> > work out why they've done what they have. I suspect that this is a
>> > simple vector identity that I've just been staring at too long and have
>> > forgotten...
>> >
>> > The code:
>> >
>> > GGt <- M0 - M1 %*% M0inv %*% t(M1)
>> > svdGG <- svd(GGt)
>> > Gmat <- svdGG$u %*% diag(sqrt(svdGG$d))
>> >
>> > It is supposed to solve:
>> >
>> > G*G^T = M0 - M1*M0^-1*M1^T
>> >
>> > for G, where G^T is the transpose of G. It is designed to reproduce a
>> > numerical method described in two papers:
>> >
>> > Srikanthan and Pegram, Journal of Hydrology, 371 (2009) 142-153,
>> > Equation A13, who suggest the SVD method but don't describe the
>> > specifics, eg: "...G is found by singular value decomposition..."
>> >
>> > Alternatively, Matalas (1967) Water Resources Research 3 (4) 937-945,
>> > Equation 17, say that the above can be solved using Principle Component
>> > Analysis (PCA).
>> >
>> > I use PCA (specifically POD) and SVD to look at the components after
>> > decomposition, so I'm a bit lost as to how the original matrix G can be
>> > constructed in this case from only the singular values and the left
>> > singular vectors.
> GG' is a symmetric matrix, so left- and right-singular vectors are the
> same. If I recall right, in general it is impossible to find G from
> GG' (I denote the transpose by ') since, given an orthogonal
> transformation U (that is, UU'=1), GUU'G' = GG', so you can only find
> G up to multiplication with an orthogonal transformation matrix.
>
> Since SVD decomposes a matrix X = UDV', the decomposition for GG' is
>
> GG' = UDU'; setting S = sqrt(D) (i.e., diagonal matrix with elements
> that are sqrt of those in D), GG' = USSU' = USS'U', so one solution is
> G = US which is the solution used.
>
> You could use PCA on G, which is roughly equivalent to doing SVD on
> GG' (up to centering and scaling of the columns of G). I am not very
> familiar with PCA in R since I always use SVD, but here's what the
> help file for prcomp (PCA in R) says:
>
> The calculation is done by a singular value decomposition of the
> (centered and possibly scaled) data matrix, not by using ‘eigen’
> on the covariance matrix. This is generally the preferred method
> for numerical accuracy.
YES! Thank you and to Martyn for the same comment. The identity that I
was missing was that for a symmetric matrix the left and right singular
values are equal, i.e.:
U = V and, by extrapolation, U' == V'
Then, back to the text books, original papers and google I get the
following:
1) G itself is also symmetric as it is a spatial (and perhaps temporal)
correlation matrix.
2) PCA: "we basically try to find eigenvalues and eigenvectors of the
covariance matrix, C. We showed that C = (AA') / (n-1), and thus finding
the eigenvalues and eigenvectors of C is the same as finding the
eigenvalues and eigenvectors of AA'."
3) Further, from some algebra of SVD results it can be shown that from
A = U S V'
we can get:
AA' = U S^2 U'
Hence, in my notation scheme the PCA decomposition of GG' via SVD is:
GG' = U S^2 U'
and the corresponding SVD of G is:
G = U S V'
but as U'=V' I can substitute:
G = U S U'
Hence, I can compute G from the SVD of GG'. This, unfortunately leads
to the conclusion that there is a bug in the code, which can be
verified, and fixed, in the sample code:
rm(list=ls())
# Make this repeatable for testing
set.seed(12003)
# Generate a dummy set of 1000 months of rain at 10 measurement stations
tsLength <- 1000;
nStations <- 10;
syntheticRain <- runif(tsLength * nStations, 0, 10)
syntheticRain <- matrix(syntheticRain, nrow = tsLength, ncol = nStations)
# Normalise the rainfall before further processing as is standard
practice in
# statistical hydrology. NB: this step should not matter though for the
method
# as the correlation will eliminate the scale and means.
syntheticRain <- scale(syntheticRain)
# Compute the spatial correlation of the stations
spatialCor <- cor(syntheticRain)
# Our dummy GGt:
GGt <- spatialCor %*% t(spatialCor)
# Now attempt the back calculation to verify
svdGGT <- svd(GGt)
spatialCorfromSVD1 <- svdGGT$u %*% diag(sqrt(svdGGT$d))
spatialCorfromSVD2 <- svdGGT$u %*% diag(sqrt(svdGGT$d)) %*% t(svdGGT$u)
Thanks very much all
-pete
--
Peter Brady
Email: pdbrady at ans.com.au
Skype: pbrady77
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 881 bytes
Desc: OpenPGP digital signature
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20140815/82e27245/attachment.bin>
More information about the R-help
mailing list