nearPD {Matrix}  R Documentation 
Compute the nearest positive definite matrix to an approximate one, typically a correlation or variancecovariance matrix.
nearPD(x, corr = FALSE, keepDiag = FALSE, do2eigen = TRUE, doSym = FALSE, doDykstra = TRUE, only.values = FALSE, ensureSymmetry = !isSymmetric(x), eig.tol = 1e06, conv.tol = 1e07, posd.tol = 1e08, maxit = 100, conv.norm.type = "I", trace = FALSE)
x 
numeric n * n approximately positive
definite matrix, typically an approximation to a correlation or
covariance matrix. If 
corr 
logical indicating if the matrix should be a correlation matrix. 
keepDiag 
logical, generalizing 
do2eigen 
logical indicating if a

doSym 
logical indicating if 
doDykstra 
logical indicating if Dykstra's correction should be used; true by default. If false, the algorithm is basically the direct fixpoint iteration Y(k) = P_U(P_S(Y(k1))). 
only.values 
logical; if 
ensureSymmetry 
logical; by default, 
eig.tol 
defines relative positiveness of eigenvalues compared to largest one, λ_1. Eigen values λ_k are treated as if zero when λ_k / λ_1 ≤ eig.tol. 
conv.tol 
convergence tolerance for Higham algorithm. 
posd.tol 
tolerance for enforcing positive definiteness (in the
final 
maxit 
maximum number of iterations allowed. 
conv.norm.type 
convergence norm type ( 
trace 
logical or integer specifying if convergence monitoring should be traced. 
This implements the algorithm of Higham (2002), and then (if
do2eigen
is true) forces positive definiteness using code from
posdefify
. The algorithm of Knol DL and ten
Berge (1989) (not implemented here) is more general in (1) that it
allows constraints to fix some rows (and columns) of the matrix and (2)
to force the smallest eigenvalue to have a certain value.
Note that setting corr = TRUE
just sets diag(.) < 1
within the algorithm.
Higham (2002) uses Dykstra's correction, but the version by Jens
Oehlschlaegel did not use it (accidentally), and has still lead to
good results; this simplification, now only via doDykstra = FALSE
,
was active in nearPD()
upto Matrix version 0.99937540.
If only.values = TRUE
, a numeric vector of eigen values of the
approximating matrix;
Otherwise, as by default, an S3 object of class
"nearPD"
, basically a list with components
mat 
a matrix of class 
eigenvalues 
numeric vector of eigen values of 
corr 
logical, just the argument 
normF 
the Frobenius norm ( 
iterations 
number of iterations needed. 
converged 
logical indicating if iterations converged. 
Jens Oehlschlaegel donated a first version. Subsequent changes by the Matrix package authors.
Cheng, Sheung Hun and Higham, Nick (1998) A Modified Cholesky Algorithm Based on a Symmetric Indefinite Factorization; SIAM J. Matrix Anal.\ Appl., 19, 1097–1110.
Knol DL, ten Berge JMF (1989) Leastsquares approximation of an improper correlation matrix by a proper one. Psychometrika 54, 53–61.
Higham, Nick (2002) Computing the nearest correlation matrix  a problem from finance; IMA Journal of Numerical Analysis 22, 329–343.
A first version of this (with nonoptional corr=TRUE
)
has been available as nearcor()
; and
more simple versions with a similar purpose
posdefify()
, both from package sfsmisc.
## Higham(2002), p.334f  simple example A < matrix(1, 3,3); A[1,3] < A[3,1] < 0 n.A < nearPD(A, corr=TRUE, do2eigen=FALSE) n.A[c("mat", "normF")] stopifnot(all.equal(n.A$mat[1,2], 0.760689917), all.equal(n.A$normF, 0.52779033, tolerance=1e9) ) set.seed(27) m < matrix(round(rnorm(25),2), 5, 5) m < m + t(m) diag(m) < pmax(0, diag(m)) + 1 (m < round(cov2cor(m), 2)) str(near.m < nearPD(m, trace = TRUE)) round(near.m$mat, 2) norm(m  near.m$mat) # 1.102 / 1.08 if(require("sfsmisc")) { m2 < posdefify(m) # a simpler approach norm(m  m2) # 1.185, i.e., slightly "less near" } round(nearPD(m, only.values=TRUE), 9) ## A longer example, extended from Jens' original, ## showing the effects of some of the options: pr < Matrix(c(1, 0.477, 0.644, 0.478, 0.651, 0.826, 0.477, 1, 0.516, 0.233, 0.682, 0.75, 0.644, 0.516, 1, 0.599, 0.581, 0.742, 0.478, 0.233, 0.599, 1, 0.741, 0.8, 0.651, 0.682, 0.581, 0.741, 1, 0.798, 0.826, 0.75, 0.742, 0.8, 0.798, 1), nrow = 6, ncol = 6) nc. < nearPD(pr, conv.tol = 1e7) # default nc.$iterations # 2 nc.1 < nearPD(pr, conv.tol = 1e7, corr = TRUE) nc.1$iterations # 11 / 12 (!) ncr < nearPD(pr, conv.tol = 1e15) str(ncr)# still 2 iterations ncr.1 < nearPD(pr, conv.tol = 1e15, corr = TRUE) ncr.1 $ iterations # 27 / 30 ! ncF < nearPD(pr, conv.tol = 1e15, conv.norm = "F") stopifnot(all.equal(ncr, ncF))# norm type does not matter at all in this example ## But indeed, the 'corr = TRUE' constraint did ensure a better solution; ## cov2cor() does not just fix it up equivalently : norm(pr  cov2cor(ncr$mat)) # = 0.09994 norm(pr  ncr.1$mat) # = 0.08746 / 0.08805 ### 3) a real data example from a 'systemfit' model (3 eq.): (load(system.file("external", "symW.rda", package="Matrix"))) # "symW" dim(symW) # 24 x 24 class(symW)# "dsCMatrix": sparse symmetric if(dev.interactive()) image(symW) EV < eigen(symW, only=TRUE)$values summary(EV) ## looking more closely {EV sorted decreasingly}: tail(EV)# all 6 are negative EV2 < eigen(sWpos < nearPD(symW)$mat, only=TRUE)$values stopifnot(EV2 > 0) if(require("sfsmisc")) { plot(pmax(1e3,EV), EV2, type="o", log="xy", xaxt="n",yaxt="n") eaxis(1); eaxis(2) } else plot(pmax(1e3,EV), EV2, type="o", log="xy") abline(0,1, col="red3",lty=2)