# [R] Summarize data for MCA (FactoMineR)

Nelson Castillo nelsoneci at gmail.com
Sat Apr 26 01:55:27 CEST 2008

```
Hi :-)

I'm new to R and have started using it for a project (I'm the CS guy in a
group of statisticians, helping them solve issues as they come up).
This is my first post to the list.

Well, they were used to doing MCA in other programs, where the data
seems to be preprocessed automatically before the MCA is run.

So, they need to process a data set with N=1000000 rows, but there are
really only about N/100 distinct rows (combinations of values over all the
variables), so the MCA can be run in reasonable time if the data is
summarized first.

So, the question is:

How can I turn x from:

x <-
structure(list(weight = c(1, 1, 2, 1, 2), var1 = structure(c(1L,
1L, 1L, 1L, 2L), .Label = c("A", "C"), class = "factor"), var2 =
structure(c(1L,
1L, 1L, 1L, 2L), .Label = c("B", "D"), class = "factor")), .Names = c("weight",
"var1", "var2"), row.names = c(NA, 5L), class = "data.frame")

to:

y <-
structure(list(weight = c(5L, 2L), var1 = structure(1:2, .Label = c("A",
"C"), class = "factor"), var2 = structure(1:2, .Label = c("B",
"D"), class = "factor")), .Names = c("weight", "var1", "var2"
), class = "data.frame", row.names = c(NA, -2L))

using R?

That is, from:

> x
  weight var1 var2
1      1    A    B
2      1    A    B
3      2    A    B
4      1    A    B
5      2    C    D

to:

> y
  weight var1 var2
1      5    A    B
2      2    C    D

The idea is that the combination "A B" occurs 4 times in the original
table and is summarized as a single row in the second table, whose weight
is the sum of the original weights (1 + 1 + 2 + 1 = 5).
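
# A minimal sketch of the summarization described above: one row per
# distinct (var1, var2) combination, with the weights summed.
# Assumption (not from the original post): base R's aggregate() with a
# formula is used. This is one possible approach and may not be the
# fastest for N=1000000 rows.
x <- data.frame(weight = c(1, 1, 2, 1, 2),
                var1 = factor(c("A", "A", "A", "A", "C")),
                var2 = factor(c("B", "B", "B", "B", "D")))

# Group by every variable except weight and sum the weights per group.
y <- aggregate(weight ~ var1 + var2, data = x, FUN = sum)

# aggregate() puts the grouping columns first; restore the order of x.
y <- y[, c("weight", "var1", "var2")]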

I solved the problem using Perl, but I'd like to know what I have to