[R] 3-way contingency table

Mathias Walter mathias.walter at googlemail.com
Mon May 2 16:03:45 CEST 2011


Hi David,

thanks for your quick response. It was really helpful.

--
Kind regards,
Mathias

2011/4/29 David Winsemius <dwinsemius at comcast.net>:
>
> On Apr 29, 2011, at 6:47 AM, Mathias Walter wrote:
>
>> Hi,
>>
>> I have a large data frame with many columns. A short example is given below:
>>
>>> dataH
>>
>>      host ms01 ms31 ms33 ms34
>> 1  cattle    4   20    9    6
>> 2   sheep    4    3    4    5
>> 3  cattle    4    3    4    5
>> 4  cattle    4    3    4    5
>> 5   sheep    4    3    5    5
>> 6    goat    4    3    4    5
>> 7   sheep    4    3    5    5
>> 8    goat    4    3    4    5
>> 9    goat    4    3    4    5
>> 10 cattle    4    3    4    5
>>
>> Now I want to determine the frequency of every unique value in
>> every column, broken down by the host column.
>>
>> It is quite easy to determine the frequencies in total with the
>> following command:
>>
>>> dataH2 <- dataH[,c(2,3,4,5)]
>>> table(as.matrix(dataH2), colnames(dataH2)[col(dataH2)], useNA="ifany")
>>
>>    ms01 ms31 ms33 ms34
>> 3     0    9    0    0
>> 4    10    0    7    0
>> 5     0    0    2    9
>> 6     0    0    0    1
>> 9     0    0    1    0
>> 20    0    1    0    0
>>
>> But I cannot manage to get it dependent on the host.
>>
>> I tried
>>
>>> xtabs(cbind(ms01, ms31, ms33, ms34) ~ ., dataH)
>>
>> and many other ways, but I have not been successful.
>>
>> I can get it for each column individually with
>>
>>> with(dataH, table(host, ms33))
>>
>>         ms33
>> host     4 5 9
>>   cattle 3 0 1
>>   deer   0 0 0
>>   goat   3 0 0
>>   human  0 0 0
>>   sheep  1 2 0
>>   tick   0 0 0
>>
>> But I do not want to repeat the command for every column. I need a
>> single table which can be plotted as a balloon plot, for instance.
>
> You have obviously not given us the full data from which your "correct
> answer" was drawn, but see if this is going in the right direction:
>
> require(reshape)
>> dataHm <- melt(dataH)
> Using host as id variables
>> xtabs(~host+value+variable, dataHm)
> , , variable = ms01
>
>        value
> host     3 4 5 6 9 20
>  cattle 0 4 0 0 0  0
>  goat   0 3 0 0 0  0
>  sheep  0 3 0 0 0  0
>
> , , variable = ms31
>
>        value
> host     3 4 5 6 9 20
>  cattle 3 0 0 0 0  1
>  goat   3 0 0 0 0  0
>  sheep  3 0 0 0 0  0
>
> , , variable = ms33
>
>        value
> host     3 4 5 6 9 20
>  cattle 0 3 0 0 1  0
>  goat   0 3 0 0 0  0
>  sheep  0 1 2 0 0  0
>
> , , variable = ms34
>
>        value
> host     3 4 5 6 9 20
>  cattle 0 0 3 1 0  0
>  goat   0 0 3 0 0  0
>  sheep  0 0 3 0 0  0
>
>>
>> Does anybody know how to achieve this?
>>
>> --
>> Kind regards,
>> Mathias
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>
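
For readers without the reshape package, the same host-by-value-by-marker table can be sketched in base R. This is only an illustration, not part of the original exchange: it assumes the ten sample rows shown in the question (so the deer, human and tick levels from the full data are absent), and the names ms, long, tab3 and flat are made up for the example. ftable() is used at the end to flatten the 3-way table into a single two-dimensional one, which is closer to the "single table" asked for.

```r
# Sample data copied from the ten rows shown in the question
dataH <- data.frame(
  host = c("cattle", "sheep", "cattle", "cattle", "sheep",
           "goat", "sheep", "goat", "goat", "cattle"),
  ms01 = rep(4, 10),
  ms31 = c(20, rep(3, 9)),
  ms33 = c(9, 4, 4, 4, 5, 4, 5, 4, 4, 4),
  ms34 = c(6, rep(5, 9))
)

ms <- dataH[, -1]  # marker columns only (hypothetical helper name)

# Long format: one row per (host, marker, value) observation.
# unlist() walks the columns in order, so host is repeated once per
# column and each marker name is repeated once per row.
long <- data.frame(
  host     = rep(dataH$host, times = ncol(ms)),
  variable = factor(rep(names(ms), each = nrow(ms)), levels = names(ms)),
  value    = unlist(ms, use.names = FALSE)
)

# 3-way contingency table, same layout as the xtabs() call above
tab3 <- xtabs(~ host + value + variable, data = long)

# Flatten to one 2-D table: rows are marker/host pairs, columns are values
flat <- ftable(tab3, row.vars = c("variable", "host"), col.vars = "value")
print(flat)
```

Individual counts can be read off by name, for example tab3["sheep", "5", "ms33"], and as.data.frame(tab3) gives a long table of counts if a plotting function wants one row per cell.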


