[R] Lack of independence in anova()

Phillip Good pigood at verizon.net
Mon Jul 4 17:28:45 CEST 2005


 If the observations are normally distributed and the 2xk design is
balanced,  theory requires that the tests for interaction and row effects be
independent.  In my program, appended below, this would translate to cntT
(approx)= cntR*cntI/N if all R routines were functioning correctly.  They
aren't.

sim2=function(size,N,p){
  cntR=0
  cntC=0
  cntI=0
  cntT=0
  cntP=0
  for(i in 1:N){
    #generate data
     v=gendata(size)
    #analyze after build(ing) design containing data
     lm.out=lm(yield~c*r,build(size,v))
     av.out=anova(lm.out)
    #if column effect is significant, increment cntC
     if (av.out[[5]][1]<=p)cntC=cntC+1
   #if row effect is significant, increment cntR
     if (av.out[[5]][2]<=p){
           cntR=cntR+1
	   tmp = 1
	   }
     else tmp =0 		
     if (av.out[[5]][3]<=p){
           #if interaction is significant, increment cntI
            cntI=cntI+1
	#if both interaction and row effect are significant, increment cntT
            cntT=cntT + tmp
            } 
     }
    list(cntC=cntC, cntR=cntR, cntI=cntI, cntT=cntT)
}

build=function(size,v){
#size is a vector containing the sample sizes
col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]),
rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8]))
row=c(rep(0,size[1]+size[2]+size[3]+size[4]),rep(1,size[5]+size[6]
+size[7]+size[8]))
return(data.frame(c=factor(col), r=factor(row),yield=v))
}

gendata=function(size){
  ssize=sum(size);
  return (rnorm(ssize))
}

#Example
 size=c(3,3,3,0,3,3,3,0)
 sim2(size,10000,10,.16)



Phillip Good
Huntington Beach CA



More information about the R-help mailing list