[R] different results in MASS's mca and SAS's corresp
Paul Johnson
pauljohn32 at gmail.com
Sun Feb 6 06:39:56 CET 2011
On Sat, Feb 5, 2011 at 9:19 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Feb 4, 2011, at 7:06 PM, Gong-Yi Liao wrote:
>
>> Dear list:
>>
>> I have tried MASS's mca function and SAS's PROC corresp on the
>> farms data (included in MASS, and also used in mca's example), and
>> the results are different:
>>
>> R: mca(farms)$rs:
>>             1             2
>> 1 0.059296637  0.0455871427
>> 2 0.043077902 -0.0354728795
>> 3 0.059834286  0.0730485572
>> 4 0.059834286  0.0730485572
[snip]
>>
>> And in SAS's corresp output:
>>
>> Row Coordinates
>>
>>       Dim1     Dim2
>>
>> 1   1.0607  -0.8155
>> 2   0.7706   0.6346
>> 3   1.0703  -1.3067
>> 4   1.0703  -1.3067
>> 5   0.2308   0.9000
[snip]
>> Is MASS's mca developed with a different definition from SAS's
>> corresp?
>
> No, it's just that the values can only be defined up to a scaling factor
> (the same situation as with eigenvector decomposition). Take a look at the
> two dimensions when each is put on the same scale:
>
>> cbind(scale(rmca$D1),scale(smca$Dim1) )
>           [,1]       [,2]
> [1,] 1.2824421 1.28242560
> [2,] 0.9316703 0.93168561
> [3,] 1.2940701 1.29403231
> [4,] 1.2940701 1.29403231
> [5,] 0.2789996 0.27905048
>> cbind(scale(rmca$D2),scale(smca$Dim2) )
>             [,1]        [,2]
> [1,]  1.06673426 -1.06677626
> [2,] -0.83006158  0.83012474
> [3,]  1.70932841 -1.70932351
> [4,]  1.70932841 -1.70932351
> [5,] -1.17729983  1.17729909
>
> David Winsemius, MD
> West Hartford, CT
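A minimal sketch of David's point, using made-up numbers rather than the farms data: if one program reports an axis on a different scale and with the opposite sign, standardizing both versions with scale() leaves only the sign difference. The vectors `x` and `y` and the factor -17.9 are hypothetical.

```r
# Hypothetical coordinates; not taken from the farms data
x <- c(0.06, 0.04, -0.03, -0.07)

# The same axis reported on a different scale and with the opposite sign
y <- -17.9 * x

# scale() centers each vector and divides by its standard deviation,
# so only the sign difference survives
cbind(scale(x), scale(y))

# TRUE if the standardized columns are mirror images of each other
all.equal(as.vector(scale(x)), -as.vector(scale(y)))
```

This is why the two Dim2 columns in David's output agree in magnitude but differ in sign.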
When I came to David's comment, I understood the theory, but not the
numbers in his answer. I wanted to see the MASS mca answers "match
up" with SAS, and his example did not (yet) show that.
Below, see that if you scale the mca output and then multiply column 1
of the scaled results by 0.827094, you DO reproduce the SAS column 1
results to the four decimal places SAS prints. The factor is just the
SAS value for row 1 divided by the scaled mca value for row 1. Do the
same with column 2, multiplying by -0.7644828, and you reproduce
column 2.
> library(MASS)
> rmca <- mca(farms)
> scalermca <- scale(rmca$rs)
> scalermca[1,]
       1        2
1.282442 1.066734
> 1.0607/1.282442
[1] 0.827094
> -0.8155/1.06673426
[1] -0.7644828
> cbind(scalermca[,1] * 0.827094, scalermca[,2] * -0.7644828)
          [,1]        [,2]
1   1.06070017 -0.81549999
2   0.77057891  0.63456780
3   1.07031764 -1.30675217
4   1.07031764 -1.30675217
5   0.23075886  0.90002547
6   0.69488883  0.60993995
7   0.10530240  0.78445402
8  -0.27026650  0.44225049
9   0.13426089  1.15670532
10  0.11861965  0.64778456
11  0.23807570  1.21775202
12  1.01156703 -0.01927226
13  0.28051938 -0.59805897
14 -1.17343686 -0.27122981
15 -0.83838041 -0.64003061
16 -0.05453708 -0.22925816
17 -0.91732401 -0.49899374
18 -0.92694148 -0.00774156
19 -1.30251038 -0.34994509
20 -1.30251038 -0.34994509
So, that does reproduce the SAS row coordinates, up to rounding. I'm a
little frustrated that I can't remember the matrix command to do that
multiplication without cbinding the 2 columns together that way.
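For what it's worth, two base R idioms multiply each column of a matrix by its own constant: sweep() over the column margin, and post-multiplication by a diagonal matrix. The sketch below reuses the two factors from above; the names `k`, `viaSweep`, `viaDiag` and `viaCbind` are only for illustration.

```r
library(MASS)  # provides mca() and the farms data

rmca <- mca(farms)
scalermca <- scale(rmca$rs)

# Per-column factors taken from the ratios computed above
k <- c(0.827094, -0.7644828)

# Option 1: sweep() multiplies column j by k[j]
viaSweep <- sweep(scalermca, 2, k, `*`)

# Option 2: post-multiply by a diagonal matrix holding the factors
viaDiag <- scalermca %*% diag(k)

# Both should agree with the cbind() version used above
viaCbind <- cbind(scalermca[, 1] * k[1], scalermca[, 2] * k[2])
all.equal(unname(viaSweep), unname(viaCbind), check.attributes = FALSE)
all.equal(unname(viaDiag), unname(viaCbind), check.attributes = FALSE)
```

sweep() generalizes to any number of columns without building a square matrix, while the diag() form is closer to how the rescaling would be written in matrix notation.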
Question: I don't use mca, but to people who do, how are results
"supposed" to be scaled? Is there a "community accepted method" or is
every user on his/her own to fiddle up the numbers however?
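One observation from the numbers posted in this thread, offered as a guess rather than a documented convention: the ratio of the SAS coordinates to the raw mca coordinates is about 17.89 in both columns, which is what p * sqrt(n) gives for farms (n = 20 rows, p = 4 factors). The sketch below checks that guess against the five SAS rows quoted above. The names `k`, `sas`, `flip` and `guess` are illustrative, and the sign of each axis is aligned by hand because it is arbitrary.

```r
library(MASS)  # mca() and the farms data

rmca <- mca(farms)

# Assumption read off the posted numbers, not taken from either manual:
# the SAS/MASS ratio is about 17.89, which equals p * sqrt(n) for farms
n <- nrow(farms)   # number of rows
p <- ncol(farms)   # number of factors
k <- p * sqrt(n)

# First five SAS row coordinates as posted in this thread
sas <- rbind(c(1.0607, -0.8155),
             c(0.7706,  0.6346),
             c(1.0703, -1.3067),
             c(1.0703, -1.3067),
             c(0.2308,  0.9000))

# The sign of each axis is arbitrary, so align it with SAS using row 1
r5 <- rmca$rs[1:5, ]
flip <- sign(r5[1, ]) * sign(sas[1, ])
guess <- sweep(r5, 2, k * flip, `*`)

# Largest absolute discrepancy from the posted SAS values
max(abs(guess - sas))
```

If the guess is right, the discrepancy should be no larger than the rounding in the SAS printout. Whether either scaling counts as the accepted one is a separate question that this check does not answer.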
--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas