[R] different results in MASS's mca and SAS's corresp

Sun Feb 6 06:39:56 CET 2011

On Sat, Feb 5, 2011 at 9:19 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Feb 4, 2011, at 7:06 PM, Gong-Yi Liao wrote:
>
>> Dear list:
>>
>>  I have tried MASS's mca function and SAS's PROC corresp on the
>>  farms data (included in MASS, also used as mca's example), the
>>  results are different:
>>
>>  R: mca(farms)$rs:
>>             1             2
>> 1   0.059296637  0.0455871427
>> 2   0.043077902 -0.0354728795
>> 3   0.059834286  0.0730485572
>> 4   0.059834286  0.0730485572
[snip]
>>
>>     And in SAS's corresp output:
>>
>>                               Row Coordinates
>>
>>                                       Dim1       Dim2
>>
>>                         1           1.0607    -0.8155
>>                         2           0.7706     0.6346
>>                         3           1.0703    -1.3067
>>                         4           1.0703    -1.3067
>>                         5           0.2308     0.9000
[snip]
>>       Is MASS's mca based on a different definition than SAS's
>>       corresp?
>
> No, it's just that the values can only be defined up to a scaling factor
> (the same situation as with eigenvector decomposition). Take a look at the
> two dimensions, when each is put on the same scale:
>
>> cbind(scale(rmca$D1),scale(smca$Dim1) )
>            [,1]        [,2]
>  [1,]  1.2824421  1.28242560
>  [2,]  0.9316703  0.93168561
>  [3,]  1.2940701  1.29403231
>  [4,]  1.2940701  1.29403231
>  [5,]  0.2789996  0.27905048

>> cbind(scale(rmca$D2),scale(smca$Dim2) )
>             [,1]        [,2]
>  [1,]  1.06673426 -1.06677626
>  [2,] -0.83006158  0.83012474
>  [3,]  1.70932841 -1.70932351
>  [4,]  1.70932841 -1.70932351
>  [5,] -1.17729983  1.17729909
>
> David Winsemius, MD
> West Hartford, CT
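
A small sketch of why scale() lines the two programs up in magnitude
but leaves the sign difference on the second dimension. The numbers
are toy values, not the farms results, and the factor 17.9 is a
made-up stand-in for whatever constant separates the two programs:

```r
# Toy scores, purely illustrative (not the farms results)
r_scores <- c(0.06, 0.04, -0.02, -0.08)
sas_same <- 17.9 * r_scores     # hypothetical: same direction, different units
sas_flip <- -17.9 * r_scores    # hypothetical: same axis, opposite sign

# Centre a vector and divide it by its standard deviation
z <- function(x) as.vector(scale(x))

# A positive factor disappears after scaling; a negative one leaves a sign flip
same_after_scaling <- isTRUE(all.equal(z(r_scores), z(sas_same)))
flip_after_scaling <- isTRUE(all.equal(z(r_scores), -z(sas_flip)))
print(c(same = same_after_scaling, flipped = flip_after_scaling))
```

So standardizing removes the size of the factor but not its sign,
which is the pattern in the second cbind() output above.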

When I came to David's comment, I understood the theory, but not the
numbers in his answer.  I wanted to see the MASS mca answers "match
up" with SAS, and the example did not (yet).

The session below shows that if you scale the mca output and then
multiply column 1 of the scaled results by 0.827094, you DO reproduce
the SAS column 1 results to about four decimal places.  The factor is
just the SAS value for row 1 divided by the scaled mca value for row 1.
Do the same with column 2, where the factor is -0.7644828, and you
reproduce column 2.

> library(MASS)
> rmca <- mca(farms)
> scalermca <- scale(rmca$rs)
> scalermca[1,]
       1        2
1.282442 1.066734
> 1.0607/1.282442
[1] 0.827094
> -0.8155/1.06673426
[1] -0.7644828
> cbind(scalermca[,1] * 0.827094, scalermca[,2] *  -0.7644828)
          [,1]        [,2]
1   1.06070017 -0.81549999
2   0.77057891  0.63456780
3   1.07031764 -1.30675217
4   1.07031764 -1.30675217
5   0.23075886  0.90002547
6   0.69488883  0.60993995
7   0.10530240  0.78445402
8  -0.27026650  0.44225049
9   0.13426089  1.15670532
10  0.11861965  0.64778456
11  0.23807570  1.21775202
12  1.01156703 -0.01927226
13  0.28051938 -0.59805897
14 -1.17343686 -0.27122981
15 -0.83838041 -0.64003061
16 -0.05453708 -0.22925816
17 -0.91732401 -0.49899374
18 -0.92694148 -0.00774156
19 -1.30251038 -0.34994509
20 -1.30251038 -0.34994509

So that does reproduce SAS, to about four decimal places.  I'm a
little frustrated that I can't remember the matrix command to do that
multiplication without cbinding the 2 columns together that way.
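
For the record, here is a sketch of three standard ways to multiply
each column of a matrix by its own constant without cbind().  The
vector name k and the result names a, b, d and ref are mine.  The two
factors are the ones computed in the session above:

```r
library(MASS)                      # provides mca() and the farms data

rmca <- mca(farms)
scalermca <- scale(rmca$rs)        # centre and scale each column

# Per-column factors taken from the session above
k <- c(0.827094, -0.7644828)

# Three equivalent ways to multiply column j by k[j]
a <- sweep(scalermca, 2, k, `*`)   # sweep over columns (MARGIN = 2)
b <- scalermca %*% diag(k)         # post-multiply by a diagonal matrix
d <- t(t(scalermca) * k)           # k recycles down the rows of the transpose

# Reference: the cbind() version used in the session above
ref <- cbind(scalermca[, 1] * k[1], scalermca[, 2] * k[2])
```

sweep() says most directly what is being done.  The diag() version
is the "matrix command" form, but it drops the column names.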

Question: I don't use mca, but for people who do, how are the results
"supposed" to be scaled?  Is there a "community accepted method", or is
every user on his/her own to rescale the numbers however he/she likes?

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas