[R-sig-ME] logistic growth, vexing choice of which timepoints to include

Sun Feb 8 21:20:02 CET 2009

Hi Jenny,

(This one is coming from a total non-statistician, so please excuse my 
naivety.)

If you are interested in how the growth rate parameter depends on 
factors such as antimicrobial concentration or genetic expression, 
then the fact that keeping or dropping points in the model fit changes 
the variance of your parameters makes me think of two things: 

1) If the variance of your growth rate parameter changes a lot 
depending on the number of zero points that you use for the fit, please 
consider that a microtiter reader run is an automated experiment and 
the experimental design can be adapted to your purpose: the data 
acquisition rate can be modified so that you have as many time points 
as you want in the "important" part of the curve (the exponential 
growth phase, for the growth rate), and then the zero-growth data should 
have minimal influence. With some prior knowledge of the parameters of 
your growth curves, you could work out an optimal design strategy for 
that data acquisition.

2) By deleting no-growth data, aren't you giving more weight to the 
experiments where your antimicrobial wasn't effective? Wouldn't it be 
informative to have data over an appropriate range of antimicrobial 
concentrations, to ensure that your secondary model is able to 
predict zero-growth-rate conditions? 
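
On point 1, here is a rough sketch of why the sampling schedule matters. 
It assumes the four-parameter logistic in the form 
A + (B - A) / (1 + exp((xmid - t) / scal)), which I believe is the 
parametrization Jenny refers to, with scal playing the role of phi_4. 
The parameter values, the two sampling schedules and the helper names 
are all made up for illustration. The "information" here is just the 
sum of squared sensitivities of the curve to scal, i.e. it ignores the 
noise variance and any correlation with the other parameters.

```python
import math

def fpl(t, A, B, xmid, scal):
    """Four-parameter logistic curve evaluated at time t."""
    return A + (B - A) / (1.0 + math.exp((xmid - t) / scal))

def d_fpl_d_scal(t, A, B, xmid, scal):
    """Partial derivative of fpl with respect to scal at time t."""
    p = 1.0 / (1.0 + math.exp((xmid - t) / scal))
    return (B - A) * p * (1.0 - p) * (xmid - t) / scal ** 2

def scal_information(times, **params):
    """Sum of squared sensitivities to scal over a sampling schedule."""
    return sum(d_fpl_d_scal(t, **params) ** 2 for t in times)

# Hypothetical curve: baseline 0, plateau 1.2 OD, midpoint at 30 h.
params = dict(A=0.0, B=1.2, xmid=30.0, scal=1.5)

# Schedule 1: one reading every 30 min from 0 to 48 h (97 readings).
uniform = [0.5 * i for i in range(97)]
# Schedule 2: the same 97 readings packed between 22 h and 38 h.
focused = [22.0 + (16.0 / 96.0) * i for i in range(97)]
# The readings of schedule 1 that fall in the flat lag before 20 h.
lag_only = [t for t in uniform if t < 20.0]

info_uniform = scal_information(uniform, **params)
info_focused = scal_information(focused, **params)
info_lag = scal_information(lag_only, **params)

print("uniform schedule :", info_uniform)
print("focused schedule :", info_focused)
print("lag readings only:", info_lag)
```

With these made-up numbers the lag readings carry almost none of the 
information about scal, and the focused schedule carries more than the 
uniform one for the same number of readings.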

The pile of zeros before the important data arrives consists of 
readings below the limit of detection of growth, as mentioned in the 
previous email, so you should be able to take them out, but you need to 
think about those two things as well...
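
To make the residual variance effect from the quoted message concrete, 
here is a back-of-the-envelope sketch. It assumes the usual least 
squares estimate sqrt(RSS / (n - p)) and that the leading zeros are 
fitted exactly, so they add observations but no residual sum of squares. 
nlme estimates the residual variance by (RE)ML rather than this formula, 
so this only illustrates the direction of the effect. All numbers and 
the function name are invented.

```python
import math

def residual_sd(rss, n_obs, n_params):
    """Least squares residual standard deviation, sqrt(RSS / (n - p))."""
    return math.sqrt(rss / (n_obs - n_params))

# Hypothetical single well: 60 readings along the sigmoidal part of the
# curve with a residual sum of squares of 0.05, and 4 curve parameters.
rss_curve = 0.05
n_curve = 60
n_params = 4

# Add leading zeros that the model is assumed to fit with no error.
for n_zeros in (0, 24, 72):
    sd = residual_sd(rss_curve, n_curve + n_zeros, n_params)
    print(n_zeros, "leading zeros -> residual sd", sd)

sd_trimmed = residual_sd(rss_curve, n_curve, n_params)
sd_full = residual_sd(rss_curve, n_curve + 72, n_params)
```

Since the zeros carry almost no information about the growth rate 
parameter, its standard error shrinks roughly in proportion to the 
residual sd, which would explain why the inference changes while the 
estimates themselves do not.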

regards,

Jesus

School of Food Science and Environmental Health
Dublin Institute of Technology
Dublin, Ireland

Jenny Bryan wrote:

>Hello.  I'm looking for advice on how to make a seemingly unavoidable  
>subjective choice in an analysis I'm doing, using the logistic growth  
>model.  I'm using nlme, but that has nothing to do with my question,  
>so I hope it's not too inappropriate for me to post this here.   
>Reading the list archive suggests that it's not too hard to tempt this  
>group into philosophical discussions :-)
>
>I have growth data that can be reasonably modelled with a four- 
>parameter logistic curve.  The experimental unit is a well in a  
>microtitre plate and I get light absorbance readings over time that  
>reflect cell density.  There are many wells on a plate, e.g. 96 or  
>384, and experiments often span many plates.  Systematic differences  
>between the wells can be, for example, specific genetic mutations  
>carried by the cells and/or different chemicals added to the growth  
>medium.  I am mostly interested in performing inference on the fixed  
>effects, i.e. how the genetic perturbations, the chemicals, or their  
>interactions, modify key growth parameters, especially the one  
>inversely related to the underlying exponential growth rate we'd see  
>in the absence of resource constraints (phi_4 in Pinheiro & Bates p.  
>517).
>
>Problem:  The number of cells inoculated into the wells at the start  
>is quite small -- well below the detection threshold for the light  
>absorbance readings.  Therefore, each timecourse begins with a loooong  
>string of zeros, before the classic sigmoidal shape kicks in.  And, of  
>course, the timing of this happy event is both ill-defined and very  
>variable across the wells.
>
>For figure-making purposes, I removed some early timepoints that were  
>uniformly zero for all wells.  Which made me wonder: why couldn't  
>(shouldn't?) I do the same prior to model fitting?  When I fit the  
>logistic growth model with and without these early timepoints, I get  
>essentially the same estimated fixed effects and, even, estimated  
>variances for the random effects.  But there *is* a substantial  
>difference in the estimate of residual variance, which then obviously  
>has a noticeable effect on the inference for the fixed effects and,  
>especially, the one I care about.  Including all the timepoints drives  
>the residual variance down, as you might expect.  But that almost  
>seems misleading or artificial ... other collaborators I work with  
>don't even start taking OD readings until the first 12 hours have  
>passed, which makes their initial strings of zeros quite short,  
>which ... gives them less statistical significance for the same  
>observed effect size?!?
>
>Does anyone have a comment or advice?
>
>Thanks in advance for reading this,
>Jenny
>
>Jennifer Bryan
>Department of Statistics and
>   the Michael Smith Laboratories
>University of British Columbia
>
>_______________________________________________
>R-sig-mixed-models at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>