[Rd] probable bugs in stats::loglin calculation of pearson chisq

Michael Friendly friendly at yorku.ca
Tue Jul 9 20:29:21 CEST 2013

In running the following example of a loglinear model for the Titanic data,
I was surprised to see NaN reported for the
Pearson chisq

> loglin(Titanic, margin=list(1:3, 4))
2 iterations: deviation 2.273737e-13
\$lrt
 671.9622

\$pearson
 NaN

\$df
 15

\$margin
\$margin[]
 "Class" "Sex"   "Age"

\$margin[]
 "Survived"

Tracing it back, this occurs because there are zeros in the
fitted/expected frequencies
for children among the Crew.

> # get fitted (expected) values
> fitted <- loglin(Titanic, margin=list(1:3, 4), fit=TRUE)\$fit
2 iterations: deviation 2.273737e-13
> fitted[Class="Crew",,Age="Child",]
Survived
Sex      No Yes
Male    0   0
Female  0   0

I certainly understand the difference between sampling
zeros and structural zeros, and this distinction seems properly
implemented in loglin()
via the start= argument, but only in the calculation of Pearson chisq,
not for LRT.  I think this is a code
bug, but if there is a reason for the difference, it should be
documented in the help for
loglin.

Another probable bug is that the calculation of of the LRT chisq also
takes zero fitted values
into account, while the calculation of the Pearson chisq does not, and
result for my example.
It occurs in the following portion of the code for loglin:

fit <- z\$fit
attributes(fit) <- attributes(table)
observed <- as.vector(table[start > 0])
expected <- as.vector(fit[start > 0])
pearson <- sum((observed - expected)^2/expected)
observed <- as.vector(table[table * fit > 0])
expected <- as.vector(fit[table * fit > 0])
lrt <- 2 * sum(observed * log(observed/expected))

I don't understand the reasons for the different calculations of
observed & expected
for pearson & lrt.

FWIW, below is how I calculate these in my mosaics.sas program

start chisq(obs, fit);
*-- Find Pearson and likelihood ratio chisquares;
gf = sum ( (obs - fit)##2 / ( fit + (fit=0) ) );
lr = 2 # sum ( obs # log ( (obs+(obs=0)) / (fit + (fit=0)) ) );
return (gf // lr);
finish;

-Michael

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca