# [R] Discrete Multivariate Analysis (log-linear model)

Jorge Magalhães jmagalhaes at oninetspeed.pt
Wed Apr 16 16:04:24 CEST 2003

```I'm reading an old statistics book, Marvin J. Karson (1982), "Multivariate
Statistical Methods", The Iowa State University Press, Iowa. Chapter XI
gives some information about discrete multivariate analysis. The chapter
is restricted to an introduction to log-linear models for the analysis of
multidimensional contingency tables. For example, in the log-linear model for
a 3-way table we can test several hypotheses:

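Note: MLE = maximum likelihood estimator
Mijk: expected count in cell (i, j, k); mijk: observed count in that cell
A "+" in place of an index means summation over that index; n = m+++
a, b, c: number of levels of A, B, C
Each hypothesis below lists the u-terms that are set to zero.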

Hypothesis #   Hypothesis           df                 MLE of Mijk
1              (ABC)                (a-1)(b-1)(c-1)    ----
2              (AB)(ABC)            (a-1)(b-1)c        mi+k m+jk / m++k
3              (AC)(ABC)            (a-1)(c-1)b        mij+ m+jk / m+j+
.....
8              (AB)(AC)(BC)(ABC)    abc-a-b-c+2        mi++ m+j+ m++k / n^2
..........
18

To make this easier to understand, Karson gives an example.

From Karson (p. 269):

"Table X presents data from a national sample of n = 10524 respondents
classified according to income, A; mobility, B; and education, C. Each
variable was categorized in two classes, with income classified as low
(< $12,500) and high otherwise, mobility classified as mobile if the respondent
has made one or more moves over the last five years and nonmobile otherwise,
and education classified as high school graduate or under versus some college
or above.

Table X

                     Mobile (j=1)               Nonmobile (j=2)
                     High School   College      High School   College
                     k=1           k=2          k=1           k=2
Low income  (i=1)    1137          1091         2160          886
High income (i=2)    547           1415         1363          1925

Observed counts:
m(111) = 1137
m(121) = 2160
and so on....

The saturated model is:

Lijk = ln Mijk
Lijk = u + u(A)(i) + u(B)(j) + u(C)(k) + u(AB)(ij) + u(AC)(ik) + u(BC)(jk)
       + u(ABC)(ijk)

For example, the independence hypothesis, 8, states that income, mobility, and
education are independent variables when grouped according to the given cross
classification of low or high, mobile or nonmobile, and high school or college.
.....". end of citation

My main question is: how can I perform a similar analysis in the R environment?
I want to test hypothesis 8 and all the others. For each hypothesis, I want to
calculate X^2 and G^2, and then select the simplest model that fits the data
moderately well.

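Note: G^2 = 2 SUM( mijk ln(mijk / Mijk) )
Note: X^2 = SUM( (mijk - Mijk)^2 / Mijk )
where mijk is the observed count, Mijk is the MLE of the expected count under
the hypothesis, and SUM runs over all cells (i, j, k).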

Jorge Magalhães

```
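One possible way to approach the question in R is sketched below. It is an illustration under assumptions, not Karson's own procedure. It uses `loglin()` from the base `stats` package, which fits hierarchical log-linear models by iterative proportional fitting and returns G^2 (`lrt`), X^2 (`pearson`) and the degrees of freedom. The object names (`m`, `hypotheses`, `fits`, `summary_table`) are made up for this example, and only hypotheses 1, 2, 3 and 8 are shown because the others are elided in the post. Note that `loglin()` is told which margins to keep in the model, which is the complement of the book's notation listing the u-terms set to zero.

```r
# Observed counts m[i, j, k] from Table X. R fills the first index fastest,
# so the order is (i,j,k) = 111, 211, 121, 221, 112, 212, 122, 222.
m <- array(
  c(1137, 547, 2160, 1363,    # k = 1: high school
    1091, 1415, 886, 1925),   # k = 2: college
  dim = c(2, 2, 2),
  dimnames = list(income    = c("low", "high"),
                  mobility  = c("mobile", "nonmobile"),
                  education = c("highschool", "college"))
)
stopifnot(sum(m) == 10524)    # n quoted in the text

# Margins that remain in the model (factor 1 = A, 2 = B, 3 = C).
# The comments give the book's notation: the u-terms set to zero.
hypotheses <- list(
  H1 = list(c(1, 2), c(1, 3), c(2, 3)),  # u(ABC) = 0
  H2 = list(c(1, 3), c(2, 3)),           # u(AB) = u(ABC) = 0
  H3 = list(c(1, 2), c(2, 3)),           # u(AC) = u(ABC) = 0
  H8 = list(1, 2, 3)                     # all interaction terms = 0
)

# Fit each model; eps and iter are tightened from the defaults (assumed
# values) so that the iterative fit for H1 is accurate.
fits <- lapply(hypotheses, function(margin)
  loglin(m, margin = margin, fit = TRUE, eps = 1e-6, iter = 100,
         print = FALSE))

summary_table <- data.frame(
  G2 = sapply(fits, function(f) f$lrt),
  X2 = sapply(fits, function(f) f$pearson),
  df = sapply(fits, function(f) f$df)
)
# Upper-tail chi-square p-value for G^2
summary_table$p_G2 <- pchisq(summary_table$G2, summary_table$df,
                             lower.tail = FALSE)
print(summary_table)

# Hand check of hypothesis 8 with the closed-form MLE mi++ m+j+ m++k / n^2
n  <- sum(m)
M8 <- outer(outer(apply(m, 1, sum), apply(m, 2, sum)), apply(m, 3, sum)) / n^2
G2_8 <- 2 * sum(m * log(m / M8))
X2_8 <- sum((m - M8)^2 / M8)
print(c(G2 = G2_8, X2 = X2_8))
```

The hand-computed values for hypothesis 8 should agree with the `H8` row of `summary_table`, since that model has a closed-form fit. The remaining hypotheses can be added to the `hypotheses` list in the same way. `MASS::loglm()` offers a formula interface to the same fitting routine, for example `loglm(~ income + mobility + education, data = m)` for hypothesis 8.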