[R] making table() work

Wed Apr 27 09:03:47 CEST 2005

I believe the problem is not with table(), but with your predictions,
which are not 0s and 1s.
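
A minimal sketch of one way to get 0/1 predictions, assuming a cutoff of 0.5 on the predicted probability. The data frame `d` and its columns `x1` and `x2` are simulated stand-ins (hypothetical, not the poster's cuData); only `glm()`, `predict()` and `table()` come from the message.

```r
# Simulated stand-in for cuData (hypothetical data, for illustration only).
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$similarity <- rbinom(n, 1, plogis(0.8 * d$x1 - 0.5 * d$x2))

train <- d[1:100, ]
test  <- d[101:200, ]

fit <- glm(similarity ~ ., family = binomial, data = train)

# predict() on a glm returns the linear predictor (log-odds) by default;
# type = "response" returns probabilities between 0 and 1 instead.
prob <- predict(fit, newdata = test, type = "response")

# Assumed cutoff of 0.5 to turn probabilities into 0/1 classes.
# Fixing the factor levels keeps the table 2 x 2 even if a class is absent.
pred_class <- factor(as.integer(prob > 0.5), levels = c(0, 1))
truth      <- factor(test$similarity, levels = c(0, 1))

tab <- table(predicted = pred_class, actual = truth)
tab
```

With the classes cut this way, table() has only two distinct values on each side, so it returns a 2 x 2 table of counts rather than one row per distinct prediction.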

Ales Ziberna
----- Original Message ----- 
From: "Stephen Choularton" <mail at bymouth.com>
To: "'R Help'" <r-help at stat.math.ethz.ch>
Sent: Wednesday, April 27, 2005 4:19 AM
Subject: [R] making table() work

I am trying to do some verification across a large dataset, cuData, that
has 23 columns.

Column 23 (similarity) is the outcome 0 or 1 and the other columns are
the features.

I do this:

verificationglm.model <- glm(formula = similarity ~ ., family=binomial,
data=cuData[1:1000,])

and produce the model:

> summary(verificationglm.model)

Call:
glm(formula = similarity ~ ., family = binomial, data = cuData[1:1000,
    ])

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.3885  -0.8943  -0.2918   0.8851   2.7025

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
(Intercept)           26.3112869 21.2229690   1.240 0.215066
length                -0.6249415  0.1906254  -3.278 0.001044 **
meanPitch             -0.0110389  0.0053083  -2.080 0.037565 *
minimumPitch           0.0002689  0.0024290   0.111 0.911845
maximumPitch          -0.0013454  0.0038149  -0.353 0.724326
meanF1                -0.0362153  0.0112499  -3.219 0.001286 **
meanF2                 0.0016765  0.0115335   0.145 0.884430
meanF3                 0.0073960  0.0076235   0.970 0.331964
meanF4                 0.0063015  0.0016820   3.746 0.000179 ***
meanF5                -0.0022535  0.0024885  -0.906 0.365153
ratioF2ToF1           -1.2322825  7.0036532  -0.176 0.860334
ratioF3ToF1           -4.9643148  4.5973552  -1.080 0.280222
jitter                -8.7535283 14.5273818  -0.603 0.546806
shimmer                1.6706067  2.6327972   0.635 0.525731
percentUnvoicedFrames -0.4863219  1.1638115  -0.418 0.676042
numberOfVoiceBreaks   -0.0335636  0.0634956  -0.529 0.597086
percentOfVoiceBreaks  -2.9353239  0.8945600  -3.281 0.001033 **
meanIntensity         -0.2931293  0.3355314  -0.874 0.382321
minimumIntensity       0.0689654  0.1531059   0.450 0.652392
maximumIntensity       0.2186570  0.2510906   0.871 0.383848
ratioIntensity        -8.1777871 13.1676287  -0.621 0.534565
noSyllsIntensity       0.1714826  0.0695021   2.467 0.013614 *
speakingRate          -0.3564808  0.1507373  -2.365 0.018034 *
startSpeech           -1.3537348  6.7337461  -0.201 0.840669
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1384.0  on 999  degrees of freedom
Residual deviance: 1084.7  on 976  degrees of freedom
AIC: 1132.7

Number of Fisher Scoring iterations: 5

>

Now I want to use the model to predict on a different part of the
dataset.

I try this, and get my prediction:

> pred <- predict(verificationglm.model, cuData[1001:2000,1:23])
> pred
        1001         1002         1003         1004         1005         1006         1007
-0.495901722 -2.406349629 -0.911082179 -0.965869553 -0.488695693 -1.849622304 -1.637722247
        1008         1009         1010         1011         1012         1013         1014
-1.148952722 -0.191538278 -1.511895046 -2.989036645 -2.775775622  0.603852124 -0.838613048
        1015         1016         1017         1018         1019         1020         1021
-0.434259674 -2.004230065 -0.234829011  1.666502334  2.039631718 -0.592192326  1.667700087
        1022         1023         1024         1025         1026         1027         1028
 0.104644531  1.748724399  0.391461247  1.356898357  1.468154760  1.090708994  1.071487227
        1029         1030         1031         1032         1033         1034         1035
 0.720596788  2.378350706 -0.128248232  0.969373318  0.315142756  1.372108172 -2.399517898
        1036         1037         1038         1039         1040         1041         1042
-0.684530171  0.761198819 -1.298372615  1.185368711 -1.148974059  0.358234433  0.671495255
        1043         1044         1045         1046         1047         1048         1049
 0.683771224  0.663767266  2.009012643  0.196591464  2.063417812  0.823472345  0.696638161

[runs on to 2000]
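
The values printed above are negative or greater than 1, which suggests they are on the link (log-odds) scale rather than probabilities. As a small sketch, using four values copied from that output, `plogis()` (the inverse logit in base R) maps them to probabilities, and cutting the log-odds at 0 gives the same classes as cutting the probabilities at 0.5. The 0.5 cutoff is an assumption, not something the original message specifies.

```r
# Four values copied from the pred output above (rows 1001, 1002, 1013, 1018).
pred <- c(-0.495901722, -2.406349629, 0.603852124, 1.666502334)

# Inverse logit: converts log-odds to probabilities in (0, 1).
prob <- plogis(pred)

# Log-odds of 0 corresponds to a probability of 0.5,
# so the two cutoffs classify identically.
class_from_link <- as.integer(pred > 0)
class_from_prob <- as.integer(prob > 0.5)

identical(class_from_link, class_from_prob)  # TRUE
class_from_link                              # 0 0 1 1
```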

However, I then want to check for classAgreement (an e1071 package
function).  First I want a table. I do this:

> t = table(pred,cuData[1001:2000,24])
> t

pred                   0 1
  -8.90070098980106    0 1
  -8.0484071844879     0 1
  -7.79298548775523    1 0
  -7.18338330609013    1 0
[runs on]

when I expect this

     0    1
0    ?    ?
1    ?    ?

with the ?’s being some count.  When I look at my slice of cuData it
looks like this:

> cuData[1001:2000,24]
   [1] 1 0 1 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 0 1 0 1 1 1 1 1 1 1 1 1
1 1 1 1
  [38] 1 1 1 0 1 1 1 1 0 1 1 1 1 0 0 0 1 1 1 1 1 0 1 0 1 1 0 1 1 1 1 0 0
0 1 1 0
  [75] 1 0 0 1 1 0 1 1 0 0 1 0 1 1 1 0 0 1 0 1 1 0 0 0 1 1 1 0 0 0 1 0 1
0 0 0 0
 [112] 0 1 1 1 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
 [149] 1 1 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 1
0 1 0 0
 [186] 0 1 0 0 0 1 0 0 0 0 1 1 0 1 1 1 0 0 1 0 1 1 0 0 0 0 0 0 1 1 1 1 1
1 1 0 1
 [223] 0 1 0 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 1 0 0 0 1 0 0 1 1 1 1 1
1 1 1 0
 [260] 1 0 0 0 0 1 1 1 0 1 1 1 1 0 1 0 0 1 0 0 1 1 1 0 1 1 1 1 1 1 0 1 1
1 1 1 0
 [297] 1 1 0 1 1 1 1 1 1 0 1 1 0 0 1 0 0 1 1 0 1 1 1 0 1 1 1 1 1 0 0 1 1
1 1 1 0
 [334] 0 1 1 1 1 0 1 1 0 0 1 1 1 1 1 1 1 0 1 1 0 0 0 0 0 1 1 1 0 0 1 1 0
1 0 1 0
 [371] 0 0 0 0 0 0 1 1 0 0 0 0 1 1 0 1 1 0 1 0 1 0 0 0 1 0 0 1 0 0 0 1 0
1 1 0 1
 [408] 1 1 0 0 0 0 1 0 1 1 1 1

[etc]

so it looks like a different layout from my pred. Does anyone know how
to make these two compatible so table() will work?
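
Once both vectors hold only 0s and 1s, the table has the expected 2 x 2 layout and agreement measures can be read from it. The sketch below uses two short made-up vectors (`predicted` and `actual` are hypothetical, not taken from cuData) and computes the proportion of agreement and Cohen's kappa by hand. The call to `e1071::classAgreement()` is only attempted if that package happens to be installed.

```r
# Hypothetical 0/1 vectors for illustration only.
predicted <- c(0, 0, 1, 1, 1, 0, 1, 0)
actual    <- c(0, 1, 1, 1, 0, 0, 1, 0)

tab <- table(predicted = factor(predicted, levels = c(0, 1)),
             actual    = factor(actual,    levels = c(0, 1)))
tab

n <- sum(tab)

# Proportion of cases on the diagonal (predicted class equals actual class).
accuracy <- sum(diag(tab)) / n

# Cohen's kappa: observed agreement corrected for chance agreement.
expected <- sum(rowSums(tab) * colSums(tab)) / n^2
kappa    <- (accuracy - expected) / (1 - expected)

accuracy  # 0.75
kappa     # 0.5

# Optional: compare with e1071, if it is available on this machine.
if (requireNamespace("e1071", quietly = TRUE)) {
  print(e1071::classAgreement(tab))
}
```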

Thanks.

Stephen

