[BioC] adjusted p values of the genes in the intersection of Venn

Robert Gentleman rgentlem at fhcrc.org
Thu Nov 16 01:34:25 CET 2006


Hi,
  Margaret, you have given us rather little information to go on, and you 
almost surely will need to discuss your problem/analysis with a local 
statistician, as these things are simply too hard to work out by email, 
in my experience.

  But imagine for a moment a scenario where you have something nice 
like a cell line, and that you have some samples that represent 
baseline, some treated with A and some with B. Then you could do 
baseline vs A and baseline vs B separately (sort of what you have 
suggested), but that is both inefficient (in the statistical sense) and 
makes the analysis you now claim to want to do harder.

  Using any software that allows you to fit linear models (genefilter 
and/or limma are two R packages that provide some capabilities in that 
direction) you can set up a model where the response y, log expression, 
is modeled as

   y = b0 + b1 IA + b2 IB + epsilon

  where IA is 1 if the array was treated with A, and 0 otherwise
        IB is 1 if the array was treated with B, and 0 otherwise
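
  The model above can be sketched for a single gene as follows. This is 
only an illustration under stated assumptions, not a substitute for 
genefilter or limma: the expression values and the group labels are made 
up, and fit_indicator_model is a hypothetical helper. With one baseline 
group and two 0/1 indicators, the least-squares estimates reduce to group 
means, so b0 is the baseline mean, b1 is the A-minus-baseline difference 
and b2 is the B-minus-baseline difference. The residual variance is pooled 
over all three groups, which is where the gain in efficiency comes from.

```python
from math import sqrt
from statistics import mean


def fit_indicator_model(y, group):
    """Least-squares fit of y = b0 + b1*IA + b2*IB + epsilon for one gene.

    y     : log expression values, one per array
    group : labels "base", "A" or "B", one per array (hypothetical coding)
    """
    values = {g: [v for v, lab in zip(y, group) if lab == g]
              for g in ("base", "A", "B")}
    means = {g: mean(vals) for g, vals in values.items()}
    sizes = {g: len(vals) for g, vals in values.items()}

    # For this design the OLS coefficients are differences of group means.
    b0 = means["base"]
    b1 = means["A"] - means["base"]
    b2 = means["B"] - means["base"]

    # Residual variance pooled over all three groups (n - 3 degrees of freedom).
    df = len(y) - 3
    rss = sum((v - means[lab]) ** 2 for v, lab in zip(y, group))
    s2 = rss / df

    # t statistics for H0: b1 = 0 and H0: b2 = 0.
    t1 = b1 / sqrt(s2 * (1 / sizes["A"] + 1 / sizes["base"]))
    t2 = b2 / sqrt(s2 * (1 / sizes["B"] + 1 / sizes["base"]))
    return {"b0": b0, "b1": b1, "b2": b2, "t1": t1, "t2": t2, "df": df}


# Made-up log expression values for one gene: 3 baseline, 3 A, 3 B arrays.
y = [7.0, 7.2, 6.8, 9.1, 8.9, 9.0, 8.0, 8.2, 7.8]
group = ["base"] * 3 + ["A"] * 3 + ["B"] * 3
fit = fit_indicator_model(y, group)
print(fit)
```

  The p-values for b1 and b2 would then come from a t distribution with 
the reported degrees of freedom. limma additionally moderates the variance 
estimate across genes, which this sketch does not attempt.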

  Then you simply want to find the set of genes where both b1 and b2 
are significantly different from 0, and from that test obtain p-values 
and adjust them as you see fit. And this has nothing really to do with 
the size of the effect for either A or B, so I am slightly confused by 
your response to Ben.
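
  A rough sketch of that last step, under assumptions that are not part 
of the message itself: one conservative way to get a single per-gene 
p-value for "both b1 and b2 differ from 0" is to take the larger of the 
two coefficient p-values (an intersection-union style test), and the 
adjustment shown is Benjamini-Yekutieli (BY), the method mentioned in the 
original question. The p-values are made up, and both_nonzero_pvalues and 
by_adjust are hypothetical helper names.

```python
def both_nonzero_pvalues(p_b1, p_b2):
    """Per-gene p-value for 'b1 != 0 and b2 != 0': the larger of the two."""
    return [max(a, b) for a, b in zip(p_b1, p_b2)]


def by_adjust(pvals):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    m = len(pvals)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction factor
    # Walk from the largest p-value down, keeping a running minimum so the
    # adjusted values stay monotone in the raw p-values.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, idx in enumerate(order):
        rank = m - steps_from_top  # rank of this p-value in ascending order
        running_min = min(running_min, pvals[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up per-gene p-values for the two coefficients (four genes).
p_b1 = [0.0001, 0.002, 0.03, 0.5]
p_b2 = [0.0004, 0.6, 0.001, 0.0002]

p_both = both_nonzero_pvalues(p_b1, p_b2)
adj = by_adjust(p_both)
selected = [i for i, q in enumerate(adj) if q < 0.01]
print(p_both, adj, selected)
```

  Note that a gene with a tiny p-value for only one treatment (the second 
and fourth genes here) does not get selected, which is the point of 
testing the joint hypothesis directly.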

  Caveat: there are many many (did I say many?) reasons why this would 
not be appropriate, and I am not going into all of them, but a few are 
that the samples for A and B are not comparable, or perhaps they were 
done on two color arrays with different references, or.....

   and only a few where it is the right thing to do - hence the need for 
a local statistician.

   But, in a nutshell - p-values give us some idea about the hypotheses we 
tested, and there are no simple ways to combine p-values from different 
experiments, as you initially asked. So, you must either give up on 
p-values, adjusted or otherwise, or fit an omnibus model that allows you 
to directly test the hypothesis of interest (if that is appropriate, 
given the data you have).

  best wishes
    Robert


Margaret Gardiner-Garden wrote:
> Dear Benjamin, Thanks for your thoughts.
> You are right that the hypothesis for each gene in the intersection would be
> that the gene is affected by both treatments.  However, we cannot merge the
> groups into treated versus nontreated because the increase in expression
> with treatment A is generally higher than that for treatment B, i.e. they
> are not equivalent treatments.
> 
> Regards
> Marg
> -----Original Message-----
> From: Benjamin Otto [mailto:b.otto at uke.uni-hamburg.de] 
> Sent: Wednesday, 15 November 2006 11:38 PM
> To: 'Margaret Gardiner-Garden'; bioconductor at stat.math.ethz.ch
> Subject: RE: [BioC] adjusted p values of the genes in the intersection of
> Venn
> 
> Hi Margaret,
> 
> I'm not quite sure if one can "merge" some FDR values.
> What hypothesis would you want the new FDR to stand for? Something like "is
> the gene affected by both treatments.."? How about merging the four groups
> (treatment A, treatment A reference, treatment B, treatment B reference) into
> two new ones (treated, non-treated) and looking at what FDR values these
> genes get?  Please correct me if I misunderstood your question. :)
> 
> Regards
> Benjamin
> 
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Margaret
> Gardiner-Garden
> Sent: 15 November 2006 06:07
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] adjusted p values of the genes in the intersection of Venn
> 
> Hi,  We have two lists of genes (one regulated by treatment A and the other
> regulated by treatment B).  Each list contains genes with a BY adjusted p
> value (or FDR) <0.01.  As we expected from the biology, when we do a Venn
> diagram many of the genes that are affected by treatment A are also affected
> by treatment B, i.e. lie in the intersection.  We are wondering how to
> calculate the FDR of the intersection set.  If a gene belongs to both
> sets, does this mean that its FDR is now less than 0.01? And does this then
> mean that the genes that are not in the intersection set (i.e. are exclusive
> to treatment A or treatment B) have an FDR of more than 0.01?
> 
> We would really appreciate any advice that people can give...
> 
> Thanks in advance,
> Marg
> 
> Dr Margaret Gardiner-Garden
> Garvan Institute of Medical Research
> 384 Victoria Street
> Darlinghurst Sydney
> NSW 2010 Australia
> 
> Phone: 61 2 9295 8348
> Fax: 61 2 9295 8321
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


