[BioC] identifying consistently expressed genes between replicates
Gordon K Smyth
smyth at wehi.EDU.AU
Mon Apr 11 09:22:17 CEST 2011
Hi Wendy,
It occurred to me after sending my last email that the code I gave you will
compute contrasts in the form OtherCell-BCELLA2 rather than
BCELLA2-OtherCell, so you need to use results<0 for positive signature
genes and results>0 for negative, i.e., the other way around to my email.
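
As a minimal sketch of the corrected selection, the snippet below applies the
flipped signs to a small made-up matrix standing in for the decideTests()
output. The gene names, column names and values are invented purely for
illustration, and the threshold of 2 out of 3 is a scaled-down stand-in for
the 20 out of 23 used in the earlier email.

```r
# Toy stand-in for the decideTests() result: rows are genes, columns are the
# OtherCell-BCELLA2 comparisons, entries are -1, 0 or 1 (all values invented).
results <- matrix(
  c(-1, -1, -1,   # geneA: lower in every other cell type, so up in BCELLA2
     1,  1,  1,   # geneB: higher in every other cell type, so down in BCELLA2
    -1,  0, -1,   # geneC: up in BCELLA2 against 2 of the 3 other cell types
     0,  0,  0),  # geneD: no significant difference anywhere
  ncol = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  c("BCELLA1", "BCELLA3", "BASO1"))
)

# Contrasts are OtherCell-BCELLA2, so a negative entry means "up in BCELLA2".
positive  <- apply(results < 0, 1, all)   # up in BCELLA2 vs every other type
negative  <- apply(results > 0, 1, all)   # down in BCELLA2 vs every other type
mostly.up <- rowSums(results < 0) >= 2    # up in BCELLA2 vs at least 2 of 3

names(which(positive))    # "geneA"
names(which(negative))    # "geneB"
names(which(mostly.up))   # "geneA" "geneC"
```

With the real fit, the same three lines would be applied to the results
object returned by decideTests(fit[,-1], ...).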
Best wishes
Gordon
On Mon, 11 Apr 2011, Gordon K Smyth wrote:
> Hi Wendy,
>
> First, let me mention that fit$sigma holds the between-replicate standard
> deviation for each gene, which is probably what you were looking for in your
> original post.
>
> Second, here is a way to compare each cell type with each of the others.
> Suppose you want signature genes for BCELLA2. The following will compare all
> other cell types back to BCELLA2:
>
> f <- factor(samplenames)
> BCELLA2vs <- relevel(f,ref="BCELLA2")
> design <- model.matrix(~BCELLA2vs)
> fit <- eBayes(lmFit(es.mx,design))
>
> Now do all the pairwise tests asking for FDR better than 0.1 and fold change
> at least 1.5 (you can choose the settings you want):
>
> results <- decideTests(fit[,-1], p=0.1, lfc=log2(1.5))
>
> You can find the indices of positive signature genes that are up in all
> comparisons by:
>
> i <- apply(results>0,1,all)
>
> or negative signature genes by
>
> i <- apply(results<0,1,all)
>
> However, you have so many cell types, some of which are probably quite
> similar. You might allow some of these comparisons to be non-significant.
> Suppose you decide to restrict to genes that are up in BCELLA2 vs 20 out of
> the 23 other cell types:
>
> i <- rowSums(results>0) >= 20
>
> You can see that any variation of this is quite easy.
>
> Best wishes
> Gordon
>
>
> On Sun, 10 Apr 2011, Wendy Qiao wrote:
>
>> Dear Gordon,
>>
>> Thank you very much for your information.
>>
>> You are right: I am comparing each cell type to the average of all the
>> others. Ideally, I want to compare each cell type to each of the others
>> pairwise and find the signature genes as you suggested. I tried this
>> before, but I am afraid that I did not take full advantage of
>> limma as I am new here.
>> Here is my problem. I am comparing 24 blood cell types (92 arrays in
>> total). Following are the steps that I took. The pairwise comparisons
>> take dozens of lines. Then I used topTable to find overexpressed
>> genes from each comparison, and finally did the 'intersect'. I believe
>> that there is an easy way to do all the pairwise comparisons and use
>> decideTests(). Would you mind giving me some hints on that?
>>
>> Thank you very much.
>> Wendy
>>
>> # samplenames = colnames of the 92 arrays; replicates share the same name
>> f<-factor(samplenames)
>> design<-model.matrix(~0+f)
>> fit<-lmFit(es.mx,design)
>> fit<-eBayes(fit)
>>
>> contrast.matrix<-makeContrasts(fBASO1-fBCELLA1, fBASO1-fBCELLA2.....
>>
>>
>>
>> fBASO1 fBCELLA1 fBCELLA2 fBCELLA3 ...
>> 1 1 0 0 0 ...
>> 2 1 0 0 0 ...
>> 3 1 0 0 0 ...
>> 4 0 1 0 0 ...
>> ...
>> 92 0 0 0 0 ...
>>
>>
>> On 10 April 2011 18:30, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>
>>> Dear Wendy,
>>>
>>> From your email, I assume that you have found signature genes by
>>> comparing each cell type to all the other cell types treated as one
>>> group. As you have correctly observed, this does not take account of
>>> consistency within the other cell types. Another way to find
>>> signature genes, that I think is superior, is to choose signature
>>> genes to be those genes that are uniquely higher or lower in the
>>> relevant cell type with respect to each of the other cell types
>>> individually. In other words, a positive signature gene is higher in
>>> the relevant cell type against every other cell type, not just against
>>> the average of the other cell types. This was the method used in:
>>>
>>> Lim E, Vaillant F, Wu D, Forrest NC, Pal B, Hart AH, Asselin-Labat ML,
>>> Gyorki DE, Ward T, Partanen A, Feleppa F, Huschtscha LI, Thorne HJ;
>>> kConFab,
>>> Fox SB, Yan M, French JD, Brown MA, Smyth GK, Visvader JE, Lindeman GJ.
>>> Aberrant luminal progenitors as the candidate target population for basal
>>> tumor development in BRCA1 mutation carriers. Nature Medicine 2009.
>>>
>>> to find stem cell signature genes. If you do it this way, consistency
>>> within the cell types is automatically taken care of, because the
>>> t-tests will only choose genes with consistent behaviour. limma can
>>> do all the relevant pairwise tests for you in a couple of lines, then
>>> use decideTests() to choose the signature genes.
>>>
>>> Best wishes
>>> Gordon
>>>
>>> ---------------------------------------------
>>> Professor Gordon K Smyth,
>>> NHMRC Senior Research Fellow,
>>> Bioinformatics Division,
>>> Walter and Eliza Hall Institute of Medical Research,
>>> 1G Royal Parade, Parkville, Vic 3052, Australia.
>>> Tel: (03) 9345 2326, Fax (03) 9347 0852,
>>> smyth at wehi.edu.au
>>> http://www.wehi.edu.au
>>> http://www.statsci.org/smyth
>>>
>>>
>>> Date: Sat, 9 Apr 2011 19:57:25 -0400
>>>> From: Wendy Qiao <wendy2.qiao at gmail.com>
>>>> To: bioconductor at r-project.org
>>>> Subject: [BioC] identifying consistently expressed genes between
>>>> replicates
>>>>
>>>> Hi all,
>>>>
>>>> I am comparing a number of cell types, and am wanting to find the
>>>> signature genes of each cell type. I used the limma package to do this.
>>>> The signature genes of a given cell type are found by the fold difference
>>>> between the given cell type and the grand mean of all the cell types, as well
>>>> as the BH-adjusted p-values. I want to add another condition to test the
>>>> consistency of expression levels of the selected genes for each cell
>>>> type. I can do this by looking at the standard deviations of gene
>>>> expressions between replicates. I am just wondering if there is any
>>>> function in limma or other BioConductor package to do this.
>>>>
>>>> Thank you in advance,
>>>> Wendy
>
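
On the makeContrasts() question raised in the thread: one alternative to the
relevel() approach is to keep the ~0+f design and build the contrast matrix
for one target cell type programmatically, so that every column reads
BCELLA2-OtherCell and the signs match intuition. The following is only a
sketch under assumptions: the four cell type names are a hypothetical subset
of the 24, and the row names assume the design columns are named "f" followed
by the factor level, as in the design shown above.

```r
# Hypothetical subset of the cell types; the real factor has 24 levels.
levs   <- c("BASO1", "BCELLA1", "BCELLA2", "BCELLA3")
target <- "BCELLA2"
others <- setdiff(levs, target)

# One row per design column (from ~0+f), one column per comparison.
cm <- matrix(0, nrow = length(levs), ncol = length(others),
             dimnames = list(paste0("f", levs),
                             paste0(target, "-", others)))

# Each column gets -1 for one "other" cell type and +1 for the target.
cm[paste0("f", others), ] <- -diag(length(others))
cm[paste0("f", target), ] <- 1

cm

# Not run here: with limma loaded and fit <- lmFit(es.mx, design) for
# design <- model.matrix(~0+f), the matrix would be used roughly as
#   fit2    <- eBayes(contrasts.fit(fit, cm))
#   results <- decideTests(fit2, p.value = 0.1, lfc = log2(1.5))
# and results > 0 would then mean "up in BCELLA2".
```

Because the contrasts are written as target minus other, no sign flip is
needed when reading the decideTests() output in this variant.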