[BioC] batch effect on variances

Robert Gentleman rgentlem at fhcrc.org
Tue Sep 26 20:09:25 CEST 2006



Lana Schaffer wrote:
> Hi,
> This is just a case of matching profiles during something
> like a time course.  In packages like GeneSpring and Spotfire the user
> is allowed to choose a profile and then find other "genes" with matching
> profiles for varying correlations.  In this case I have pvalues that are
> significant for differential expression at short time periods and not
> significant at long time periods.  I want to get the list of genes with that
> profile.

   genefinder in the genefilter package does something like this.

> Alternatively, the log expression ratio is high at short time periods and
> then levels off at long time periods for all the genes of interest.  I
> thought that the high pvalues were due to increased variance and therefore
> heterogeneity.  I don't know how to think about the decreased expression
> along with this, since I am dealing with differential expression.  I have
> done co-expression analysis for all these genes and find them to be
> co-expressed between 2 modules.  So the goal is to show levels of
> heterogeneity between the time periods.
> 
> I am wondering if I used limma correctly, for I divided up the samples into
> 3 "time periods" and then fit the samples together.  I then used
> contrasts to get adjusted pvalues for the genes for the 3 "time periods".
> When I graphed the trends in pvalues for each of the genes over time I get
> profiles which increase and then flatten for a set of genes (I want to get
> that set of genes) and then other profiles.  I want to show that the
> variance (heterogeneity) increases with time for some of the genes.

   I do not understand your preoccupation with p-values. I think you 
should be interested in patterns of expression, not patterns in the 
p-values.


> I think that I could do a multivariate regression to indicate a regression
> in differential expression, but then if the ratio is leveling off then
> regression won't tell me anything.

   That is why you need to fully specify the profile of interest, and 
then measure distances from it.
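
   As a rough illustration of specifying a profile and measuring distances
from it (this is not the genefinder implementation): the sketch below ranks
genes by correlation distance, 1 - r, from a target profile. The gene names,
the expression values and the function names are all made up for the example,
and correlation is only one of several distances one might choose.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant profile has no defined correlation")
    return sxy / sqrt(sxx * syy)


def rank_by_profile(target, profiles):
    """Return (gene, distance) pairs sorted by 1 - correlation with target."""
    dists = [(gene, 1.0 - pearson(target, values))
             for gene, values in profiles.items()]
    return sorted(dists, key=lambda pair: pair[1])


# Hypothetical target: high log ratio early, levelling off later.
target = [2.0, 1.0, 0.2, 0.1, 0.1]

# Hypothetical log expression ratios at the same five time points.
profiles = {
    "geneA": [1.8, 0.9, 0.3, 0.1, 0.0],  # close to the target shape
    "geneB": [0.1, 0.2, 1.0, 1.9, 2.1],  # rises instead of falling
    "geneC": [0.5, 0.6, 0.4, 0.5, 0.6],  # roughly flat
}

ranked = rank_by_profile(target, profiles)
for gene, dist in ranked:
    print(f"{gene}\t{dist:.3f}")
```

   Genes at the top of the ranking have the shape closest to the target;
where to cut the list is a separate decision.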


> I hope you can understand where I am going.
> Lana
> 
> 
> ----- Original Message ----- 
> From: "Robert Gentleman" <rgentlem at fhcrc.org>
> To: "Lana Schaffer" <schaffer at scripps.edu>
> Cc: <bioconductor at stat.math.ethz.ch>
> Sent: Tuesday, September 26, 2006 9:02 AM
> Subject: Re: [BioC] batch effect on variances
> 
> 
>>
>> Lana Schaffer wrote:
>>> Hi,
>>> I want to find out if there is a batch effect (FEM or REM) on the 
>>> variance for 2 sets of
>>> data which are discrete (different) treatments (time).  The GeneMeta 
>>> package is designed
>>> to combine batches which measure the same treatment effects.  However, I 
>>> have what
>>> corresponds to 2 different treatment effects.  Is it valid to check 
>>> homogeneity for the 2 batches?
>>  Hi,
>>    You can do some things, but I am not sure why you care. If the two 
>> experiments do not have the same treatments then there is no sensible 
>> analysis that combines them, so whether or not the variances are the same 
>> seems like an odd question, at least to me.  What would you want to say 
>> about it and how might you try to use it?
>>
>>   You can fit an appropriate model to each gene in each experiment 
>> separately, say using limma or any of the multitude of packages in BioC 
>> that do this. Once that has been done, you can estimate per-gene variances, 
>> and their ratio, suitably normalized, will almost surely follow some form 
>> of F distribution (provided that samples are not too small and that the 
>> models are reasonable).  But I am still not sure what you would do with 
>> such information.
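
A minimal sketch of the per-gene variance ratio described in the quoted
paragraph, assuming the simplest possible model (one mean per gene per
experiment) rather than a limma fit. The data and the function name are
hypothetical. Under normality the ratio would be compared with an F
distribution on (n1 - 1, n2 - 1) degrees of freedom; the tail probability is
left out here because it needs a statistics library.

```python
from statistics import variance


def variance_ratios(exp1, exp2):
    """Per-gene ratio of sample variances, with its degrees of freedom.

    exp1 and exp2 map gene -> list of replicate values in each experiment.
    Only genes present in both experiments are compared.
    """
    out = {}
    for gene in sorted(exp1.keys() & exp2.keys()):
        x, y = exp1[gene], exp2[gene]
        v1, v2 = variance(x), variance(y)  # unbiased, divisor n - 1
        if v2 == 0:
            continue  # ratio undefined when the denominator variance is zero
        out[gene] = (v1 / v2, len(x) - 1, len(y) - 1)
    return out


# Hypothetical replicate measurements for two genes in two experiments.
experiment1 = {"geneA": [1.0, 2.0, 3.0, 4.0], "geneB": [1.0, 1.5, 1.0, 1.5]}
experiment2 = {"geneA": [2.0, 2.5, 3.0, 3.5], "geneB": [0.0, 1.0, 0.0, 1.0]}

ratios = variance_ratios(experiment1, experiment2)
for gene, (ratio, df1, df2) in ratios.items():
    print(f"{gene}\tF={ratio:.2f}\tdf=({df1}, {df2})")
```
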
>>
>>   best wishes
>>     Robert
>>
>>
>>> Lana Schaffer
>>> Biostatistics/Informatics
>>> The Scripps Research Institute
>>> DNA Array Core Facility
>>> La Jolla, CA 92037
>>> (858) 784-2263
>>> (858) 784-2994
>>> schaffer at scripps.edu
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: 
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>> -- 
>> Robert Gentleman, PhD
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M2-B876
>> PO Box 19024
>> Seattle, Washington 98109-1024
>> 206-667-7700
>> rgentlem at fhcrc.org
>>
> 



