[R] polychoric correlation error

John Fox jfox at mcmaster.ca
Sat Aug 5 04:48:16 CEST 2006


Dear Janet,

Because you didn't set the value of the random-number generator seed, your
example isn't precisely reproducible, but the problem is apparent anyway:

> set.seed(12345)
> n<-100
> test.x<-rnorm(n, mean=0, sd=1)
> test.c<-test.x + rnorm(n, mean=0, sd=.5) 
> thresh.x<-c(-2.5, -1, -.5, .5, 1000) 
> thresh.c<-c(-1, 1, 2, 3, 1000)
> 
> discrete.x<-discrete.c<-vector(length=n)
> 
> for (i in 1:n) {
+ discrete.x[i]<-which.min(thresh.x < test.x[i] )
+ discrete.c[i]<-which.min(thresh.c < test.c[i] ) }
> 
> table(discrete.x, discrete.c)
          discrete.c
discrete.x  1  2  3  4  5
         2 12  1  0  0  0
         3  3 12  0  0  0
         4  2 19  2  0  0
         5  0 18 21  9  1
> 
> cor(test.x, test.c)
[1] 0.9184189
> 
> pc <- polychor(discrete.x, discrete.c, std.err=T, ML=T)
Warning messages:
1: NaNs produced in: log(x) 
2: NaNs produced in: log(x) 
3: NaNs produced in: log(x) 
> pc

Polychoric Correlation, ML est. = 0.9077 (0.03314)
Test of bivariate normality: Chisquare = 3.103, df = 11, p = 0.9893

  Row Thresholds
  Threshold Std.Err.
1  -1.12200   0.1609
2  -0.56350   0.1309
3   0.03318   0.1235


  Column Thresholds
  Threshold Std.Err.
1   -0.9389   0.1489
2    0.4397   0.1292
3    1.2790   0.1707
4    2.3200   0.3715
> 

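The latent variables that you've created are indeed bivariate normal, but
they are highly correlated, and your choice of cut points leaves the
contingency table very sparse: in the table above, 9 of the 20 cells are
empty, and the lowest category of discrete.x doesn't occur at all. That makes
it hard to estimate the correlation from the table, and apparently produces
some difficulty in the maximization of the likelihood. Nevertheless, the ML
estimates for the set of data above are pretty good: the estimated
correlation is 0.9077, against a sample correlation of 0.9184 between the
underlying continuous variables. (In your case, the optimization failed.)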
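
The sparseness of the table can be checked directly. What follows is only a
minimal sketch: the counts are copied from the table printed in the session
above, the helper name sparse.summary is made up for illustration, and the
fallback to the two-step estimator (ML=FALSE) when the ML fit stops with an
error is a suggestion that was not tried in this thread. The polycor part
runs only if that package is installed.

```r
# Counts copied from the table printed in the session above
# (rows: discrete.x = 2..5, columns: discrete.c = 1..5).
tab <- matrix(c(12,  1,  0, 0, 0,
                 3, 12,  0, 0, 0,
                 2, 19,  2, 0, 0,
                 0, 18, 21, 9, 1),
              nrow = 4, byrow = TRUE,
              dimnames = list(discrete.x = 2:5, discrete.c = 1:5))

# Hypothetical helper (not part of polycor): how sparse is a two-way table?
sparse.summary <- function(tab) {
  list(cells  = length(tab),     # total number of cells
       empty  = sum(tab == 0),   # cells with no observations
       below5 = sum(tab < 5))    # cells with fewer than 5 observations
}
s <- sparse.summary(tab)
print(s)

# Optional, assuming polycor is installed: try the ML fit, and fall back
# to the two-step estimate if the optimizer stops with an error.
if (requireNamespace("polycor", quietly = TRUE)) {
  fit <- tryCatch(
    polycor::polychor(as.table(tab), ML = TRUE, std.err = TRUE),
    error = function(e) {
      message("ML fit failed (", conditionMessage(e), "); using two-step estimate")
      polycor::polychor(as.table(tab), ML = FALSE, std.err = TRUE)
    })
  print(fit)
}
```

A table with this many empty cells is a signal to consider merging adjacent
categories or choosing less extreme cut points before fitting.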

BTW, a more straightforward way to create the categorical variables would be

discrete.x <- cut(test.x, c(-Inf, -2.5, -1, -.5, .5, Inf))
discrete.c <- cut(test.c, c(-Inf, -1, 1, 2, 3, Inf))
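
To see that the two codings agree, here is a small sketch in base R. It
assumes the same thresholds as in your example; the names loop.x and cut.x are
invented for the comparison. Note that cut() returns a factor with intervals
closed on the right by default, and that it keeps levels that happen to be
empty.

```r
set.seed(12345)   # any seed will do; this is the one used in the session above
n <- 100
test.x <- rnorm(n, mean = 0, sd = 1)
thresh.x <- c(-2.5, -1, -.5, .5, 1000)

# Original coding: index of the first threshold that is >= the value
loop.x <- vapply(test.x, function(v) which.min(thresh.x < v), integer(1))

# cut() with its default right-closed intervals (a, b]
cut.x <- cut(test.x, c(-Inf, -2.5, -1, -.5, .5, Inf))

# The integer codes of the factor match the loop-based categories
stopifnot(all(as.integer(cut.x) == loop.x))

# Cross-tabulate the two codings; all counts fall on the diagonal
print(table(loop.x, cut.x))
```

If the empty levels get in the way, droplevels(cut.x) removes them, and
as.integer(cut.x) recovers the numeric codes used in the original loop.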

I hope this helps,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
-------------------------------- 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of 
> Rosenbaum, Janet
> Sent: Friday, August 04, 2006 5:49 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] polychoric correlation error
> 
> 
> Dear all,
> 
> I get a strange error when I find polychoric correlations 
> with the ML method, which I have been able to reproduce using 
> randomly-generated data.
> 
> What is wrong?  
> I realize that the data that I generated randomly is a bit 
> strange, but it is the only way that I could duplicate the error message.
> 
> 
> > n<-100
> > test.x<-rnorm(n, mean=0, sd=1)
> > test.c<-test.x + rnorm(n, mean=0, sd=.5)
> > thresh.x<-c(-2.5, -1, -.5, .5, 1000)
> > thresh.c<-c(-1, 1, 2, 3, 1000)
> > 
> > discrete.x<-discrete.c<-vector(length=n)
> > 
> > for (i in 1:n) {
> + 	discrete.x[i]<-which.min(thresh.x < test.x[i] )
> + 	discrete.c[i]<-which.min(thresh.c < test.c[i] ) }
> > pc<-polychor(discrete.x, discrete.c, std.err=T, ML=T)
> Error in optim(c(optimise(f, interval = c(-1, 1))$minimum, 
> rc, cc), f,  : 
> 	non-finite finite-difference value [1]
> In addition: There were 50 or more warnings (use warnings() 
> to see the first 50)
> > print(pc)
> Error in print(pc) : object "pc" not found
> > warnings()
> Warning messages:
> 1: NaNs produced in: log(x)
> 2: NA/Inf replaced by maximum positive value
> 3: NaNs produced in: log(x) 
> 
> 
> ---
> 
> Thanks,
> 
> Janet
> 


