[BioC] identifying consistently expressed genes between replicates

Gordon K Smyth smyth at wehi.EDU.AU
Mon Apr 11 09:22:17 CEST 2011


Hi Wendy,

It occurred to me after sending my last email that the code I gave you will 
compute contrasts in the form OtherCell-BCELLA2 rather than 
BCELLA2-OtherCell, so you need to use results<0 for positive signature 
genes and results>0 for negative, i.e., the reverse of what I wrote in my 
email.
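
A minimal sketch of the corrected selection, in base R only so that it runs 
without limma. The sample labels, the expression values in y and the small 
results matrix are made up for illustration; in practice results would come 
from decideTests() and have one column per other cell type.

```r
# Part 1: why the sign flips. With BCELLA2 as the reference level, each
# non-intercept coefficient estimates OtherCell - BCELLA2.
f <- factor(c("BCELLA2", "BCELLA2", "BASO1", "BASO1", "TCELL", "TCELL"))
BCELLA2vs <- relevel(f, ref = "BCELLA2")
design <- model.matrix(~BCELLA2vs)

# Hypothetical gene that is higher in BCELLA2 than in the other cell types
y <- c(10, 10, 7, 7, 6, 6)
cf <- lm.fit(design, y)$coefficients
# Both non-intercept coefficients are negative for this gene

# Part 2: selection on a toy stand-in for the decideTests() output
# (-1 = significantly lower, 0 = not significant, 1 = significantly higher,
#  all in the direction OtherCell - BCELLA2)
results <- rbind(geneA = c(-1, -1, -1),
                 geneB = c( 1,  1,  1),
                 geneC = c(-1,  0, -1))
colnames(results) <- c("BASO1", "TCELL", "NK")

# Positive signature genes: up in BCELLA2 against every other cell type
pos.all <- apply(results < 0, 1, all)
# Negative signature genes: down in BCELLA2 against every other cell type
neg.all <- apply(results > 0, 1, all)
# Relaxed version: up in BCELLA2 against at least 2 of the 3 other types
pos.most <- rowSums(results < 0) >= 2
```

With the real data, the relaxed threshold would be 20 of the 23 other cell 
types instead of 2 of 3.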

Best wishes
Gordon

On Mon, 11 Apr 2011, Gordon K Smyth wrote:

> Hi Wendy,
>
> First, let me mention that fit$sigma holds the between-replicate standard 
> deviation for each gene, which is probably what you were looking for in your 
> original post.
>
> Second, here is a way to compare each cell type with each of the others. 
> Suppose you want signature genes for BCELLA2.  The following will compare all 
> other cell types back to BCELLA2:
>
>  f <- factor(samplenames)
>  BCELLA2vs <- relevel(f,ref="BCELLA2")
>  design <- model.matrix(~BCELLA2vs)
>  fit <- eBayes(lmFit(es.mx,design))
>
> Now do all the pairwise tests asking for FDR better than 0.1 and fold change 
> at least 1.5 (you can choose the settings you want):
>
>  results <- decideTests(fit[,-1], p.value=0.1, lfc=log2(1.5))
>
> You can find the indices of positive signature genes that are up in all 
> comparisons by:
>
> i <- apply(results>0,1,all)
>
> or negative signature genes by
>
> i <- apply(results<0,1,all)
>
> However, you have so many cell types, some of which are probably quite 
> similar.  You might allow some of these comparisons to be non-significant. 
> Suppose you decide to restrict to genes that are up in BCELLA2 vs 20 out of 
> the 23 other cell types:
>
> i <- rowSums(results>0) >= 20
>
> You can see that any variation of this is quite easy.
>
> Best wishes
> Gordon
>
>
> On Sun, 10 Apr 2011, Wendy Qiao wrote:
>
>> Dear Gordon,
>> 
>> Thank you very much for your information.
>> 
>> You are right: I am comparing each cell type to the average of all the 
>> others. Ideally, I want to compare each cell type to each of the others 
>> pairwise and find the signature genes as you suggested. I tried this 
>> before, but I am afraid that I did not take full advantage of 
>> limma as I am new here.
>> 
>> Here is my problem. I am comparing 24 blood cell types (92 arrays in 
>> total). The steps that I took are below. The pairwise comparisons 
>> take dozens of lines. I then used topTable to find overexpressed 
>> genes from each comparison, and finally took the 'intersect'. I believe 
>> that there is an easier way to do all the pairwise comparisons and use 
>> decideTests(). Would you mind giving me some hints on that?
>> 
>> Thank you very much.
>> Wendy
>> 
>> # samplenames = column names of the 92 arrays; replicates have the same name
>> f<-factor(samplenames)
>> design<-model.matrix(~0+f)
>> fit<-lmFit(es.mx,design)
>> fit<-eBayes(fit)
>> 
>> contrast.matrix<-makeContrasts(fBASO1-fBCELLA1, fBASO1-fBCELLA2.....
>> 
>> 
>>
>>    fBASO1 fBCELLA1 fBCELLA2 fBCELLA3 ...
>> 1       1        0        0        0 ...
>> 2       1        0        0        0 ...
>> 3       1        0        0        0 ...
>> 4       0        1        0        0 ...
>> ...
>> 92      0        0        0        0 ...
>> 
>> 
>> On 10 April 2011 18:30, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>> 
>>> Dear Wendy,
>>> 
>>> From your email, I assume that you have found signature genes by 
>>> comparing each cell type to all the other cell types treated as one 
>>> group.  As you have correctly observed, this does not take account of 
>>> consistency within the other cell types.  Another way to find 
>>> signature genes, that I think is superior, is to choose signature 
>>> genes to be those genes that are uniquely higher or lower in the 
>>> relevant cell type with respect to each of the other cell types 
>>> individually.  In other words, a positive signature gene is higher in 
>>> the relevant cell type against every other cell type, not just against 
>>> the average of the other cell types.  This was the method used in:
>>> 
>>> Lim E, Vaillant F, Wu D, Forrest NC, Pal B, Hart AH, Asselin-Labat ML,
>>> Gyorki DE, Ward T, Partanen A, Feleppa F, Huschtscha LI, Thorne HJ; 
>>> kConFab,
>>> Fox SB, Yan M, French JD, Brown MA, Smyth GK, Visvader JE, Lindeman GJ.
>>> Aberrant luminal progenitors as the candidate target population for basal
>>> tumor development in BRCA1 mutation carriers.  Nature Medicine 2009.
>>> 
>>> to find stem cell signature genes.  If you do it this way, consistency 
>>> within the cell types is automatically taken care of, because the 
>>> t-tests will only choose genes with consistent behaviour.  limma can 
>>> do all the relevant pairwise tests for you in a couple of lines, then 
>>> use decideTests() to choose the signature genes.
>>> 
>>> Best wishes
>>> Gordon
>>> 
>>> ---------------------------------------------
>>> Professor Gordon K Smyth,
>>> NHMRC Senior Research Fellow,
>>> Bioinformatics Division,
>>> Walter and Eliza Hall Institute of Medical Research,
>>> 1G Royal Parade, Parkville, Vic 3052, Australia.
>>> Tel: (03) 9345 2326, Fax (03) 9347 0852,
>>> smyth at wehi.edu.au
>>> http://www.wehi.edu.au
>>> http://www.statsci.org/smyth
>>> 
>>>
>>>  Date: Sat, 9 Apr 2011 19:57:25 -0400
>>>> From: Wendy Qiao <wendy2.qiao at gmail.com>
>>>> To: bioconductor at r-project.org
>>>> Subject: [BioC] identifying consistently expressed genes between
>>>>        replicates
>>>> 
>>>> Hi all,
>>>> 
>>>> I am comparing a number of cell types, and am wanting to find the 
>>>> signature genes of each cell type. I used the limma package to do this. 
>>>> The signature genes of a given cell type are found by the fold difference 
>>>> between the given cell type and the grand mean of all the cell types, as well 
>>>> as the BH-adjusted p-values. I want to add another condition to test the 
>>>> consistency of expression levels of the selected genes for each cell 
>>>> type. I can do this by looking at the standard deviations of gene 
>>>> expressions between replicates. I am just wondering if there is any 
>>>> function in limma or other BioConductor package to do this.
>>>> 
>>>> Thank you in advance,
>>>> Wendy
>
