[R-sig-ME] logistic growth, vexing choice of which timepoints to include
Steven McKinney
smckinney at bccrc.ca
Sat Feb 7 00:18:53 CET 2009
Hi Jenny,
[Caveat: Comments from an applied statistician, not
a world-heavyweight likelihood theorist]
On the logistic scale a zero value maps to
minus infinity. It seems to me that only
the 'last' zero value contains any information
relevant to the likelihood (equivalently, only
the 'first' one value, which maps to plus infinity
on the logistic scale, contains any information
relevant to the likelihood). Perhaps the
code for the likelihood has not been set
up to take this into account, so you are getting
the artificial contribution of the rest of the
zero values folded into the likelihood, artificially
deflating the variance estimates.
I would exclude or set to NA all but the last
(or even all of) the zero values for any well
and all but the first (or even all of) the one
values.
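
The masking suggested above could be sketched as follows. This is only
an illustration in Python, assuming readings for one well are ordered
in time and scaled so that 0 is the floor and 1 is the plateau; the
function name and the floor/ceiling arguments are hypothetical, and
None stands in for NA.

```python
from typing import List, Optional


def mask_uninformative(readings: List[float],
                       floor: float = 0.0,
                       ceiling: float = 1.0) -> List[Optional[float]]:
    """Keep the last leading floor value and the first trailing ceiling value.

    All other values in the leading run of floor readings and in the
    trailing run of ceiling readings are replaced by None (i.e. NA).
    """
    n = len(readings)

    # Length of the run of floor values at the start of the timecourse.
    lead = 0
    while lead < n and readings[lead] <= floor:
        lead += 1

    # Length of the run of ceiling values at the end of the timecourse.
    trail = 0
    while trail < n - lead and readings[n - 1 - trail] >= ceiling:
        trail += 1

    masked: List[Optional[float]] = list(readings)
    # Drop all leading floor values except the last one.
    for i in range(max(lead - 1, 0)):
        masked[i] = None
    # Drop all trailing ceiling values except the first one.
    for i in range(n - trail + 1, n):
        masked[i] = None
    return masked


well = [0.0, 0.0, 0.0, 0.1, 0.5, 0.9, 1.0, 1.0, 1.0]
print(mask_uninformative(well))
```

Dropping "even all of" the zero or one values, the other option
mentioned above, would amount to masking the whole leading and
trailing runs instead.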
The zero values are really below the detection
limit of the sensor involved, so should theoretically
be handled as left-censored data, but that's another
level of complexity for the analysis.
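
One common way to treat readings below a detection limit is to let
each such reading contribute the probability of falling at or below
the limit, rather than a density at zero. A minimal sketch, assuming
independent normal errors with a common sigma (an assumption made
here for illustration, not something stated in this thread); all
function names are hypothetical, and this is not what nlme does.

```python
import math


def normal_logpdf(y: float, mu: float, sigma: float) -> float:
    """Log density of a normal(mu, sigma) distribution at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(y: float, mu: float, sigma: float) -> float:
    """Log of P(Y <= y) for Y ~ normal(mu, sigma).

    Uses erfc for accuracy in the lower tail; for extremely negative z
    the probability underflows to zero and math.log raises ValueError.
    """
    z = (y - mu) / sigma
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def censored_loglik(observed, fitted, sigma: float, limit: float) -> float:
    """Log-likelihood with readings at or below `limit` treated as censored."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        if y <= limit:
            # Only "the true value is at or below the limit" is known.
            total += normal_logcdf(limit, mu, sigma)
        else:
            total += normal_logpdf(y, mu, sigma)
    return total
```

Under this sketch, a zero reading whose fitted value sits well below
the limit contributes a log-probability close to zero, so a long run
of such zeros adds almost nothing to the likelihood.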
Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre
email: smckinney +at+ bccrc +dot+ ca
tel: 604-675-8000 x7561
BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C.
V5Z 1L3
Canada
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org on behalf of Jenny Bryan
Sent: Fri 2/6/2009 1:54 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] logistic growth, vexing choice of which timepoints to include
Hello. I'm looking for advice on how to make a seemingly unavoidable
subjective choice in an analysis I'm doing, using the logistic growth
model. I'm using nlme, but that has nothing to do with my question,
so I hope it's not too inappropriate for me to post this here.
Reading the list archive suggests that it's not too hard to tempt this
group into philosophical discussions :-)
I have growth data that can be reasonably modelled with a four-
parameter logistic curve. The experimental unit is a well in a
microtitre plate and I get light absorbance readings over time that
reflect cell density. There are many wells on a plate, e.g. 96 or
384, and experiments often span many plates. Systematic differences
between the wells can be, for example, specific genetic mutations
carried by the cells and/or different chemicals added to the growth
medium. I am mostly interested in performing inference on the fixed
effects, i.e. how the genetic perturbations, the chemicals, or their
interactions, modify key growth parameters, especially the one
inversely related to the underlying exponential growth rate we'd see
in the absence of resource constraints (phi_4 in Pinheiro & Bates p.
517).
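
For reference, the four-parameter logistic curve can be sketched as
below. The parameterisation is an assumption: it follows the form
used by nlme's SSfpl, A + (B - A) / (1 + exp((xmid - x) / scal)),
with phi1..phi4 taken to correspond to A, B, xmid and scal. The
Python function name is hypothetical.

```python
import math


def four_param_logistic(x: float, phi1: float, phi2: float,
                        phi3: float, phi4: float) -> float:
    """phi1 + (phi2 - phi1) / (1 + exp((phi3 - x) / phi4)).

    phi1: lower asymptote, phi2: upper asymptote,
    phi3: time of the midpoint, phi4: scale (assumed positive).
    """
    t = (x - phi3) / phi4
    # Evaluate the logistic fraction without overflowing exp().
    if t >= 0.0:
        frac = 1.0 / (1.0 + math.exp(-t))
    else:
        e = math.exp(t)
        frac = e / (1.0 + e)
    return phi1 + (phi2 - phi1) * frac


# Well before the midpoint, with phi1 = 0, the curve is close to
# phi2 * exp((x - phi3) / phi4), i.e. exponential growth at rate 1 / phi4.
early_ratio = four_param_logistic(3.0, 0.0, 1.0, 20.0, 2.0) / \
    four_param_logistic(2.0, 0.0, 1.0, 20.0, 2.0)
print(early_ratio, math.exp(1.0 / 2.0))
```

The early-phase comparison is what makes phi4 "inversely related" to
the exponential growth rate: a smaller phi4 means faster early growth.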
Problem: The number of cells inoculated into the wells at the start
is quite small -- well below the detection threshold for the light
absorbance readings. Therefore, each timecourse begins with a loooong
string of zeros, before the classic sigmoidal shape kicks in. And, of
course, the timing of this happy event is both ill-defined and very
variable across the wells.
For figure-making purposes, I removed some early timepoints that were
uniformly zero for all wells. Which made me wonder: why couldn't
(shouldn't?) I do the same prior to model fitting? When I fit the
logistic growth model with and without these early timepoints, I get
essentially the same estimated fixed effects and, even, estimated
variances for the random effects. But there *is* a substantial
difference in the estimate of residual variance, which then obviously
has a noticeable effect on the inference for the fixed effects and,
especially, the one I care about. Including all the timepoints drives
the residual variance down, as you might expect. But that almost
seems misleading or artificial ... other collaborators I work with
don't even start taking OD readings until the first 12 hours have
passed, which makes their initial strings of zeros quite short,
which ... gives them less statistical significance for the same
observed effect size?!?
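
The arithmetic behind this can be sketched with a simplified
residual variance, RSS / (n - p), which is only a rough stand-in for
what a mixed model estimates. The residuals below are made-up
numbers for illustration, and the leading zeros are assumed to be
fitted exactly, so they add nothing to the residual sum of squares
while inflating n.

```python
def residual_variance(residuals, n_params: int) -> float:
    """Residual sum of squares divided by residual degrees of freedom."""
    rss = sum(r * r for r in residuals)
    return rss / (len(residuals) - n_params)


# Hypothetical residuals from the informative (sigmoidal) part of one curve.
growth_phase = [0.03, -0.02, 0.04, -0.05, 0.01, -0.03, 0.02, -0.01]
n_params = 4  # four-parameter logistic

# Same fit, with and without 40 leading zeros that are fitted perfectly.
without_zeros = residual_variance(growth_phase, n_params)
with_zeros = residual_variance([0.0] * 40 + growth_phase, n_params)

print(without_zeros, with_zeros, without_zeros / with_zeros)
```

The residual sum of squares is identical in both cases; only the
divisor changes, so the estimate shrinks in proportion to the number
of zeros included.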
Does anyone have a comment or advice?
Thanks in advance for reading this,
Jenny
Jennifer Bryan
Department of Statistics and
the Michael Smith Laboratories
University of British Columbia