[R-sig-ME] logistic growth, vexing choice of which timepoints to include

Sat Feb 7 23:16:56 CET 2009

Hi Jenny,

What Steven said below is true: the zeros are below the detection limit.
However, one might ask whether the time until the populations cross the
detection threshold matters. For instance, if two wells treated
differently both have similar logistic curves, but one starts
accelerating at t(i) and the other at t(j), j > i, that does provide
some information about what is going on below the detection limit.

A sophisticated approach might be to fit a joint model, modeling the
time to the event and the logistic growth simultaneously. Given that
that is hard, and there may not be any software to do it, you might want
to fit the survival model (time to event) and then the logistic growth
model on the non-zero data. This is very ad hoc, but it will at least
give you some idea of whether it is worth chasing a more complicated
model. This will be more effective if you have replicate wells.
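
To make the data preparation for that two-stage idea concrete, here is a
minimal Python sketch (not nlme code, and not a fitted model). It only
splits one well's series into a time-to-detection record and the
readings from detection onward. The function name split_well, the
list-based data layout, and the choice to treat a well that never rises
above the limit as right-censored at its last timepoint are all
assumptions made for illustration.

```python
def split_well(times, readings, detection_limit=0.0):
    """Split one well's time course at the first detectable reading.

    Returns (event_time, detected, post_times, post_readings).
    event_time is the first timepoint whose reading exceeds
    detection_limit; the true crossing lies somewhere between that
    timepoint and the previous one. If no reading exceeds the limit,
    detected is False and event_time is the last observed timepoint
    (an assumed right-censoring convention).
    """
    if len(times) != len(readings):
        raise ValueError("times and readings must have the same length")
    if not times:
        raise ValueError("need at least one timepoint")
    for i, y in enumerate(readings):
        if y > detection_limit:
            # Keep everything from the first detectable reading onward,
            # for use in the logistic growth fit.
            return times[i], True, list(times[i:]), list(readings[i:])
    # Never detected: no data for the growth fit, censored event time.
    return times[-1], False, [], []


# Hypothetical readings for three wells, hours vs. absorbance.
hours = [0, 2, 4, 6, 8]
wells = {
    "A1": [0.0, 0.0, 0.05, 0.20, 0.45],
    "A2": [0.0, 0.0, 0.0, 0.04, 0.15],
    "A3": [0.0, 0.0, 0.0, 0.0, 0.0],
}
for name, od in wells.items():
    event_time, detected, post_t, post_y = split_well(hours, od)
    print(name, event_time, detected, post_t, post_y)
```

The (event_time, detected) pairs would feed the survival model, and the
post-detection readings would feed the logistic growth model.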

Nicholas

> Message: 1
> Date: Fri, 6 Feb 2009 13:54:47 -0800
> From: Jenny Bryan <jenny at stat.ubc.ca>
> Subject: [R-sig-ME] logistic growth,	vexing choice of which timepoints
> 	to include
> To: r-sig-mixed-models at r-project.org
> Message-ID: <4B7E17CD-C967-479F-85CD-97404D975969 at stat.ubc.ca>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
> 
> Hello.  I'm looking for advice on how to make a seemingly unavoidable  
> subjective choice in an analysis I'm doing, using the logistic growth  
> model.  I'm using nlme, but that has nothing to do with my question,  
> so I hope it's not too inappropriate for me to post this here.   
> Reading the list archive suggests that it's not too hard to tempt this  
> group into philosophical discussions :-)
> 
> I have growth data that can be reasonably modelled with a four- 
> parameter logistic curve.  The experimental unit is a well in a  
> microtitre plate and I get light absorbance readings over time that  
> reflect cell density.  There are many wells on a plate, e.g. 96 or  
> 384, and experiments often span many plates.  Systematic differences  
> between the wells can be, for example, specific genetic mutations  
> carried by the cells and/or different chemicals added to the growth  
> medium.  I am mostly interested in performing inference on the fixed  
> effects, i.e. how the genetic perturbations, the chemicals, or their  
> interactions, modify key growth parameters, especially the one  
> inversely related to the underlying exponential growth rate we'd see  
> in the absence of resource constraints (phi_4 in Pinheiro & Bates p.  
> 517).
> 
> Problem:  The number of cells inoculated into the wells at the start  
> is quite small -- well below the detection threshold for the light  
> absorbance readings.  Therefore, each timecourse begins with a loooong  
> string of zeros, before the classic sigmoidal shape kicks in.  And, of  
> course, the timing of this happy event is both ill-defined and very  
> variable across the wells.
> 
> For figure-making purposes, I removed some early timepoints that were  
> uniformly zero for all wells.  Which made me wonder: why couldn't  
> (shouldn't?) I do the same prior to model fitting?  When I fit the  
> logistic growth model with and without these early timepoints, I get  
> essentially the same estimated fixed effects and, even, estimated  
> variances for the random effects.  But there *is* a substantial  
> difference in the estimate of residual variance, which then obviously  
> has a noticeable effect on the inference for the fixed effects and,  
> especially, the one I care about.  Including all the timepoints drives  
> the residual variance down, as you might expect.  But that almost  
> seems misleading or artificial ... other collaborators I work with  
> don't even start taking OD readings until the first 12 hours have  
> passed, which makes their initial strings of zeros quite short,  
> which ... gives them less statistical significance for the same  
> observed effect size?!?
> 
> Does anyone have a comment or advice?
> 
> Thanks in advance for reading this,
> Jenny
> 
> Jennifer Bryan
> Department of Statistics and
>    the Michael Smith Laboratories
> University of British Columbia
> 
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Fri, 6 Feb 2009 15:18:53 -0800
> From: Steven McKinney <smckinney at bccrc.ca>
> Subject: Re: [R-sig-ME] logistic growth,	vexing choice of which
> 	timepoints to include
> To: "Jenny Bryan" <jenny at stat.ubc.ca>,
> 	<r-sig-mixed-models at r-project.org>
> Message-ID:
> 	<0BE438149FF2254DB4199E2682C8DFEB0328A5AC at crcmail1.BCCRC.CA>
> Content-Type: text/plain; charset="iso-8859-1"
> 
> Hi Jenny,
> 
> [Caveat: Comments from an applied statistician, not
> a world-heavyweight likelihood theorist]
> 
> In the logistic world a zero value maps to a 
> minus infinity value.  It seems to me that only
> the 'last' zero value contains any information
> relevant to the likelihood (equivalently only
> the 'first' one value (plus infinity in the
> logistic realm) contains any information
> relevant to the likelihood).  Perhaps the
> coding for the likelihood has not been set
> up to take this into account so you are getting
> the artificial contribution of the rest of the
> zero values folded into the likelihood, artificially
> deflating the variance estimates.
> 
> I would exclude or set to NA all but the last
> (or even all of) the zero values for any well
> and all but the first (or even all of) the one 
> values.
> 
> The zero values are really below the detection
> limit of the sensor involved so should theoretically
> be handled as truncated data but that's another
> level of complexity for the analysis.
> 
> Steven McKinney
> 
> Statistician
> Molecular Oncology and Breast Cancer Program
> British Columbia Cancer Research Centre
> 
> email: smckinney +at+ bccrc +dot+ ca
> 
> tel: 604-675-8000 x7561
> 
> BCCRC
> Molecular Oncology
> 675 West 10th Ave, Floor 4
> Vancouver B.C. 
> V5Z 1L3
> Canada