[R] Logistic Regression with genetic component

Ben Bolker bbolker at gmail.com
Sun Dec 4 16:50:41 CET 2011


Danielle Duncan <dlduncan2 <at> alaska.edu> writes:

> Greetings, I have a question that I'd like to get input on. I have a
> classic toxicology study where I artificially fertilized and exposed
> embryos to a chemical and counted defects. In addition, I kept track of
> male-female pairs that I used to artificially fertilize and generate
> embryos with. I need to use logistic regression to model the response, but
> also check that the genetics of the pairings did or did not have an effect
> on the response. My data looks a bit like this:
> 
> response matrix chemical concentration      Genetic Matrix
> Present Absent                Male Female
> 2         152                       0.13 a 1
> 6         121                          1 a 2
> 21         92                          2 b 3
> 24         89                          5 b 4
> 0        141                         10 c 5
> 5         95                         15 c      6
> 
> R code:
> 
> DA<-cbind(Present, Absent)
> glm<-(DA ~ chemical concentration)
> 
> If I do glm<-(DA ~ chemical concentration + Male + Female, I get every
> possible combination, but I only want specific pairs. So, I am thinking
> about doing:
> 
> MF<-cbind(Male, Female)
> glm<-(DA ~ chemical concentration + MF)


You're on the right track.  paste() is probably what
you want, although you can also use interaction() to
get the interactions and then droplevels() to get
rid of the unobserved crosses.

d <- read.table(textConnection("
Present Absent   conc   Male Female
2         152                       0.13 a 1
6         121                          1 a 2
21         92                          2 b 3
24         89                          5 b 4
0        141                         10 c 5
5         95                         15 c      6"),
header=TRUE)

Either of these should give you what you want:
  
d <- droplevels(transform(d,cross=interaction(Male,Female)))
levels(d$cross)

d <- transform(d,cross=paste(Male,Female,sep="."))
levels(d$cross)

You should be a little careful -- if each cross is exposed
only to a single concentration, and if you treat cross as
a fixed effect, you will overparameterize your model.  If
you treat it as a random effect, e.g. using glmer in the
lme4 package:

glmer(cbind(Present,Absent)~conc+(1|cross),data=d)

you will effectively be fitting a model for overdispersion
(see the example in ?cbpp).



More information about the R-help mailing list