[R] Creating contingency table from mixed data
(Ted Harding)
ted.harding at nessie.mcc.ac.uk
Sun May 6 10:48:14 CEST 2007
On 05-May-07 23:14:38, spime wrote:
>
> Hi,
>
> I am new in R. Please help me in the following case.
>
> I have data in hand:
> http://www.nabble.com/file/8225/Data.txt Data.txt
>
> There are some categorical (binary and nominal) and continuous
> variables.
>
> How can I get a generic R x C contingency table from this table? My main
> objective is to find the count in each cell and the mean of the
> continuous variables in each cell.
>
> Please reply.
>
> Thanks in advance
If what is in that file is all your data, then it is easily and
quite quickly (about 10 minutes) done by hand, facilitated by first
re-ordering your data as:
Var1 Var2 Var3 Var4 Var5
0 11 1 0 144
0 17 1 1 123
0 15 1 1 117
0 18 2 0 99
0 22 2 1 142
1 17 1 0 136
1 10 1 1 109
1 8 2 1 133
1 17 2 1 108
1 11 3 0 112
1 16 3 0 121
1 12 3 1 152
From which, the following is easy to obtain:
                           Var3:
         ----------------------------------------------
Var1:0   | 1            | 2            | 3            |
=======================================================
Var4:0   | (11,144)     | (18, 99)     |              |
         |              |              |              |
-------------------------------------------------------
Count:   | 1            | 1            | 0            |
Mean:    | (11,144)     | (18, 99)     |              |
=======================================================
Var4:1   | (17,123)     | (22,142)     |              |
         | (15,117)     |              |              |
-------------------------------------------------------
Count:   | 2            | 1            | 0            |
Mean:    | (16,120)     | (22,142)     |              |
=======================================================

                           Var3:
         ----------------------------------------------
Var1:1   | 1            | 2            | 3            |
=======================================================
Var4:0   | (17,136)     |              | (11,112)     |
         |              |              | (16,121)     |
-------------------------------------------------------
Count:   | 1            | 0            | 2            |
Mean:    | (17,136)     |              | (13.5,116.5) |
=======================================================
Var4:1   | (10,109)     | ( 8,133)     | (12,152)     |
         |              | (17,108)     |              |
-------------------------------------------------------
Count:   | 1            | 2            | 1            |
Mean:    | (10,109)     | (12.5,120.5) | (12,152)     |
=======================================================
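If you want to follow along in R, the twelve rows listed above can be
typed in as a dataframe. This is only a sketch: it assumes those rows
are all of the data (they are copied from the listing, not read from
the original Data.txt), and the sort at the end simply reproduces the
ordering used for the hand tabulation.

```r
# Values copied from the re-ordered listing above (assumed to be complete)
Dat <- data.frame(
  Var1 = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
  Var2 = c(11, 17, 15, 18, 22, 17, 10, 8, 17, 11, 16, 12),
  Var3 = c(1, 1, 1, 2, 2, 1, 1, 2, 2, 3, 3, 3),
  Var4 = c(0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1),
  Var5 = c(144, 123, 117, 99, 142, 136, 109, 133, 108, 112, 121, 152)
)

# Sort by the three categorical variables; order() keeps tied rows in
# their existing order, so rows within a cell stay as entered
Dat <- Dat[order(Dat$Var1, Dat$Var3, Dat$Var4), ]

print(Dat, row.names = FALSE)
```

Starting from the file in its original row order, the same order()
call should give the listing shown above, apart possibly from the order
of rows within a cell.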
To do it automatically, you could get the counts alone by
applying table() to the "factor" columns (Var1, Var3 and Var4, taken
all together). Thus (where "Dat" is a dataframe with columns
Var1,...,Var5):
> table(Dat$Var4,Dat$Var3,Dat$Var1,dnn=c("Var4","Var3","Var1"))
, , Var1 = 0

    Var3
Var4 1 2 3
   0 1 1 0
   1 2 1 0

, , Var1 = 1

    Var3
Var4 1 2 3
   0 1 0 2
   1 1 2 1

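which is basically a contingency table format already. Alternatively,
you could get both counts and means by applying by() with the
functions sum() and colMeans() to the "continuous" variables, thus:
# sum(x[,2]>0) counts the rows (every Var2 is positive); colMeans() gives
# the column means, which mean() no longer does for a dataframe in current R
CT <- by(Dat,list(var1=Dat$Var1,Var3=Dat$Var3,Var4=Dat$Var4),
  function(x){list(Count=sum(x[,2]>0),Mean=colMeans(x[,c(2,5)]))})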
which produces:
var1: 0
Var3: 1
Var4: 0
$Count
[1] 1
$Mean
Var2 Var5
11 144
------------------------------------------------------------
var1: 1
Var3: 1
Var4: 0
$Count
[1] 1
$Mean
Var2 Var5
17 136
------------------------------------------------------------
var1: 0
Var3: 2
Var4: 0
$Count
[1] 1
$Mean
Var2 Var5
18 99
------------------------------------------------------------
var1: 1
Var3: 2
Var4: 0
NULL
------------------------------------------------------------
var1: 0
Var3: 3
Var4: 0
NULL
------------------------------------------------------------
var1: 1
Var3: 3
Var4: 0
$Count
[1] 2
$Mean
Var2 Var5
13.5 116.5
------------------------------------------------------------
var1: 0
Var3: 1
Var4: 1
$Count
[1] 2
$Mean
Var2 Var5
16 120
------------------------------------------------------------
var1: 1
Var3: 1
Var4: 1
$Count
[1] 1
$Mean
Var2 Var5
10 109
------------------------------------------------------------
var1: 0
Var3: 2
Var4: 1
$Count
[1] 1
$Mean
Var2 Var5
22 142
------------------------------------------------------------
var1: 1
Var3: 2
Var4: 1
$Count
[1] 2
$Mean
Var2 Var5
12.5 120.5
------------------------------------------------------------
var1: 0
Var3: 3
Var4: 1
NULL
------------------------------------------------------------
var1: 1
Var3: 3
Var4: 1
$Count
[1] 1
$Mean
Var2 Var5
12 152
but this format is not very convenient for incorporating into
a contingency table such as the one shown above (obtained by hand).
Probably others can find a way to convert the above output from
CT into a contingency table.
However, unless you have a lot of these to do, it may be quicker
to do one, or a few, by hand!
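One possible way to do the conversion, offered as a sketch rather than
a finished solution: tapply() returns the cell means as arrays with the
same layout as the output of table(), so the two means can be pasted
into "(Var2 mean,Var5 mean)" labels like those in the hand-made table.
The names grp, counts, mean2, mean5 and cells are only illustrative,
and Dat is entered again from the listing above so that the example
runs on its own.

```r
# Data re-entered from the listing earlier in this message (assumed complete)
Dat <- data.frame(
  Var1 = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
  Var2 = c(11, 17, 15, 18, 22, 17, 10, 8, 17, 11, 16, 12),
  Var3 = c(1, 1, 1, 2, 2, 1, 1, 2, 2, 3, 3, 3),
  Var4 = c(0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1),
  Var5 = c(144, 123, 117, 99, 142, 136, 109, 133, 108, 112, 121, 152)
)

# Grouping variables: rows (Var4) x columns (Var3) x layers (Var1)
grp <- list(Var4 = Dat$Var4, Var3 = Dat$Var3, Var1 = Dat$Var1)

# Cell counts, and cell means of each continuous variable
# (tapply() returns NA for cells that contain no rows)
counts <- table(grp)
mean2  <- tapply(Dat$Var2, grp, mean)
mean5  <- tapply(Dat$Var5, grp, mean)

# Paste the two means into one label per cell, blanking the empty cells
cells <- array(paste0("(", mean2, ",", mean5, ")"),
               dim = dim(counts), dimnames = dimnames(counts))
cells[as.vector(counts) == 0] <- ""

# Counts as one flat table, with Var1 and Var4 down the side
print(ftable(counts, row.vars = c("Var1", "Var4")))

# Mean labels, one Var4 x Var3 layer for each value of Var1
print(cells, quote = FALSE)
```

The counts and the mean labels are printed separately here; combining
them into a single display like the hand-made table would need some
further formatting.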
Hoping this helps,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 06-May-07 Time: 02:57:12
------------------------------ XFMail ------------------------------