[R] Comparing slopes in two linear models

Thu Feb 12 13:00:36 CET 2009

Hi everyone,

I have a data frame (d), wich has the results of mosquitoes trapping in
three different places.
I suspect that one of these places (Local=='Palm') is biased by low
numbers and will yield slower slopes in the variance-mean regression over
the areas. I wonder if these slopes are diferents.

I've looked trought the support list for methods for comparing slopes and
found the three explained above:

#d - whole datafrmae
d$Local <- as.factor(d$Local)
dHT <- subset(d, Local!='Palm')
dP <- subset(d, Local!='Palm')

1 - Single dataframe procedure:

fit1 <- lm(log(varCDI) ~ Local + I(log(CDI)) + Local:I(log(CDI)), data=d)
fit2 <- lm(log(varCDI) ~ Local + I(log(CDI)) + Local:I(log(CDI)),
data=dHT)
a1 <- anova(fit1)
a2 <- anova(fit2)

a1
Local:I(log(CDI))   2   1.183   0.592    7.8741 0.0005574 ***
Residuals         152  11.421   0.075

a2
Local:I(log(CDI))   1  0.061   0.061   0.8156 0.36859
Residuals         102  7.643   0.075

>>>> Concluding that the places are different

2- proposed by Spencer Graves:

Spencer <- function(df1, df2){
 fit1 <- lm( log(df1$varCDI)~log(df1$CDI) )
 fit2 <- lm( log(df2$varCDI)~log(df2$CDI) )
 s1 <- summary(fit1)$coefficients
 s2 <- summary(fit2)$coefficients
 db <- (s2[2,1]-s1[2,1])

#### TEST DEGREES OF FREEDOM
 df <- (fit1$df.residual+fit2$df.residual)

### Pooled Variance ####

 sd <- sqrt(s2[2,2]^2+s1[2,2]^2)

 td <- db/sd
 2*pt(-abs(td), df)
}
Spencer(dP, dHT)
[1] 0.0001717730

3- I wrote one based on Zar (1999) and Kleinbaum (1987) procedure:

Zar <- function(df1, df2){
 fit1 <- lm( log(df1$varCDI)~log(df1$CDI) )
 fit2 <- lm( log(df2$varCDI)~log(df2$CDI) )
 s1 <- summary(fit1)$coefficients
 s2 <- summary(fit2)$coefficients
 db <- (s1[2,1]-s2[2,1])

#### TEST DEGREES OF FREEDOM
 n1 <- fit1$df.residual
 n2 <- fit2$df.residual
 n <- n1 + n2

### Pooled Variance ####

 ssx1 <- var(df1$CDI) * (n1+1)
 ssx2 <- var(df2$CDI) * (n2+1)
 RSS1 <- anova(fit1)[2,2]
 RSS2 <- anova(fit2)[2,2]
 sspvm <- ( RSS1 + RSS2 ) / n
 sd1_2 <- sqrt( sspvm/ssx1 + sspvm/ssx2 )

 td <- db/sd1_2
 2*pt(-abs(td), n)
}
Zar(dP, dHT)
[1] 0.4040566

Is there anything wrong with the function I wrote?
Assuming that there is nothing wrong with the function I wrote, I'd like
to know why R dos not follow this procedure.
Why does the variation over the X-axis is not considered for the pooled
variance?
Perhaps a simpler question: how is calculated the coeficient standart
error wich is printed by 'summary'?

Thanks,
Arthur