[R] Tetrachoric and polychoric coefficients (for sem) - any tips?
John Fox
jfox at mcmaster.ca
Tue Nov 30 03:25:34 CET 2004
Dear Michael,
I had some time this evening, so I programmed the two-step procedure
described in the article that I mentioned. This isn't the ML estimate of the
correlation, but apparently it performs reasonably well and it is quite
fast. I checked the function on some examples and it appears to work
correctly. I'd be curious to see how this function compares with the one
that you received.
A more complete solution would take a data frame and, as appropriate,
calculate polychoric, polyserial, and product-moment correlations among its
columns. I don't think that writing a polyserial-correlation function would
be any more difficult. Perhaps I'll add this to the sem package; I hesitate
to do so because the resulting standard errors and likelihoods from sem()
won't be right.
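
As a rough illustration of the data-frame idea above, the following sketch only decides which kind of correlation would suit each pair of columns; it does not compute any of them. The name cor_types and the rule that ordinal variables are stored as ordered factors (with every other column numeric) are assumptions made for this example, not part of the sem package.

```r
# Hypothetical helper (not part of sem): label the kind of correlation
# that would suit each pair of columns in a data frame.
# Assumption: ordinal variables are ordered factors, all others numeric.
cor_types <- function(data) {
    ok <- vapply(data, function(col) is.numeric(col) || is.ordered(col),
                 logical(1))
    if (!all(ok)) stop("columns must be numeric or ordered factors")
    p <- ncol(data)
    types <- matrix("", p, p, dimnames = list(names(data), names(data)))
    labels <- c("product-moment", "polyserial", "polychoric")
    for (i in seq_len(p)) {
        for (j in seq_len(p)) {
            # 0, 1 or 2 ordinal variables in the pair
            n.ordinal <- is.ordered(data[[i]]) + is.ordered(data[[j]])
            types[i, j] <- labels[n.ordinal + 1]
        }
    }
    types
}

# Made-up data, for illustration only
d <- data.frame(a = c(1.2, 0.4, 2.5, 3.1),
                b = c(0.3, 1.1, 0.9, 2.2),
                u = factor(c(1, 2, 2, 3), ordered = TRUE),
                v = factor(c("lo", "lo", "hi", "hi"),
                           levels = c("lo", "hi"), ordered = TRUE))
cor_types(d)
```

A full version would replace each label with a call to the matching estimator and assemble the results into a correlation matrix.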
I'm taking the liberty of copying this reply to the r-help list, since the
question was originally raised there. I hope that's OK.
Regards,
John
-------------- snip ------------
library(mvtnorm)  # provides pmvnorm(), used below

polychor <- function (x, y){
    # x: a contingency table of counts or an ordered categorical variable
    # y: if x is a variable, a second ordered categorical variable
    # returns the polychoric correlation for the table
    #  or between x and y, using the two-step approximation
    #  to the ML estimate described in F. Drasgow, "Polychoric and
    #  polyserial correlations," in S. Kotz and N. Johnson, eds.
    #  The Encyclopedia of Statistics, Volume 7, New York: Wiley,
    #  1986.
    f <- function(rho) {
        # negative log-likelihood of the table, given the fixed cutpoints
        P <- matrix(0, r, c)
        R <- matrix(c(1, rho, rho, 1), 2, 2)
        for (i in 1:r){
            for (j in 1:c){
                P[i,j] <- pmvnorm(lower=c(row.cuts[i], col.cuts[j]),
                                  upper=c(row.cuts[i+1], col.cuts[j+1]),
                                  corr=R)
            }
        }
        - sum(tab * log(P))
    }
    tab <- if (missing(y)) x else table(x, y)
    r <- nrow(tab)
    c <- ncol(tab)
    n <- sum(tab)
    # step 1: cutpoints from the marginal cumulative proportions
    row.cuts <- c(-Inf, qnorm(cumsum(rowSums(tab))/n))
    col.cuts <- c(-Inf, qnorm(cumsum(colSums(tab))/n))
    # step 2: one-dimensional search over rho; the interval spans (-1, 1)
    #  so that negative correlations can be found as well
    optimise(f, interval=c(-1, 1))$minimum
}
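
The first step of the two-step procedure can be checked in isolation with base R alone. The sketch below uses a made-up 3 x 3 table (the counts are purely illustrative) and computes the cutpoints the same way polychor() does. It then relies on the fact that at rho = 0 the bivariate-normal rectangle probabilities factor into products of the marginal probabilities, so the value of the objective at rho = 0 can be obtained without pmvnorm().

```r
# A small 3 x 3 table of counts (made-up numbers, for illustration only)
tab <- matrix(c(20, 10,  5,
                10, 20, 10,
                 5, 10, 20), 3, 3, byrow = TRUE)
n <- sum(tab)

# Step 1: cutpoints from the marginal cumulative proportions
row.cuts <- c(-Inf, qnorm(cumsum(rowSums(tab))/n))
col.cuts <- c(-Inf, qnorm(cumsum(colSums(tab))/n))

# The last cutpoint is +Inf because the cumulative proportion reaches 1
row.cuts

# With rho = 0 the cell probabilities are products of the marginal
# probabilities, so no bivariate-normal integration is needed
P0 <- outer(diff(pnorm(row.cuts)), diff(pnorm(col.cuts)))
all.equal(P0, outer(rowSums(tab), colSums(tab))/n^2)

# Negative log-likelihood at rho = 0, i.e. what f(0) would compute
nll0 <- -sum(tab * log(P0))
nll0
```

For any other value of rho the cell probabilities no longer factor, which is why the function above needs pmvnorm() from the mvtnorm package.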
> -----Original Message-----
> From: Michael Dewey [mailto:m.dewey at iop.kcl.ac.uk]
> Sent: Monday, November 29, 2004 1:38 PM
> To: John Fox
> Subject: RE: [R] Tetrachoric and polychoric coefficients (for
> sem) - any tips?
>
> At 18:07 28/11/04, you wrote:
> >Dear Michael,
> >
> >I'm not aware of pre-existing R code for tetrachoric or polychoric
> >correlations. I may at some point incorporate such functions into the
> >sem package but I don't have concrete plans for doing so. On the other
> >hand, I don't think that it would be very hard to do so. (A discussion,
> >references, and an example are in Kotz and Johnson, eds., Encyclopedia
> >of Statistics, Vol 7.)
>
> Someone has kindly emailed me some code for the polychoric
> case (which he says is quite slow). I will try it out; it may
> be better to treat the 2 by 2 case separately. If it is OK, I
> plan to write an equivalent to cor for matrices of
> tetrachoric/polychoric correlations. It will not happen any time
> soon, but if I do manage it, would you be interested in it for sem?
> As you say, there remains the problem that they are estimates
> and introduce further uncertainties.
>
>
> >Tetrachoric and polychoric correlations are estimates, and in
> >subsequently estimating a SEM from these, one should take that into
> >account.
>
> Michael Dewey
> m.dewey at iop.kcl.ac.uk
>