[R] comparing tables from replicated data

Troels Ring tring at gvdnet.dk
Tue Aug 25 08:47:59 CEST 2009


Dear friends, I'm examining the characteristics of two models that both 
fit the sodium concentration in 16 pigs quite well under treatment or 
control conditions. The more complicated model is by anova better than 
the less complicated model.  To take it further I have generated 
replicate data using the independent variables and parameter estimates 
under the two models. A clinically important criterion is the change in 
sodium concentration during the experiment, and as expected due to the 
character of the treatment this is larger in all the treated animals 
(n=10) as compared to the controls (n=6). This is also the case for 1000 
replicated sets under the more complex model while quite a few of 
misclassifications (control animal change > treated animal change) 
occurs under the less complex model. To understand (a bit at least) what 
goes on I have tried to see the observed data under random group 
assignment in the hope to be able to compare directly and formally the 
results from the replicates under the two models.
Here are the observed changes in the 16 pigs and grp1 is treated and 2 
is control.

grp <- as.factor(c(rep(1,10),rep(2,6)))
val <- c(6,12,11,11,11,13,15,13,11,11,2,3,1,1,1,2)
test <- sum(val[grp==1]<max(val[grp==2])) # 0

#Now under random perturbations of group assignments,
#what would occur???

TT <- NULL
for (i in 1:1000){
ind <- sample(c(1:16),16,replace=FALSE)
grp1 <- grp[ind]
TT[i] <- sum(val1[grp1==1]<max(val1[grp1==2]))
}
hist(TT)
table(TT)
TT
   0    1    2    3    4    5    6    7    8    9   10
   2    5  126  407  408  117    9 1026 3171  879 3850

For the less complex model, the results on 1000 replicates are 
"evidently" better than the TT default
table(test11b)
test11b
  0   1   2   3   4   5   6   7   8   9  10
279 294 191 114  53  40  17   6   3   2   1

and for the more advanced model I get even more convincing
table(test11d)
test11d
   0
1000

Clinically I can say that it is bad to have 1 in 16 misclassified and 
therefore judge the complicated model better, but others might disagree. 
Also it is not too good that the method here is insensitive to the size 
of the changes.

I hope some of you will have remarks on this problem.

Best wishes
Troels

-- 

Troels Ring - -
Department of nephrology - - 
Aalborg Hospital 9100 Aalborg, Denmark - -
+45 99326629 - -
tring at gvdnet.dk




More information about the R-help mailing list