[BioC] edgeR on ncRNA analysis question

Gordon K Smyth smyth at wehi.EDU.AU
Sun Dec 1 09:55:48 CET 2013


Dear Alessandro,

In the usual edgeR pipeline, one does not construct different datasets to 
make different comparisons.  Rather the idea is to analyse all the samples 
together, and simply to test different comparisons.  Scale normalization 
is done only once.
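The point can be illustrated with a small sketch in plain Python. This is not edgeR code: the counts, sample names other than the WT ones, and the helper functions `cpm` and `columns` are all hypothetical, and the scale factors are set to 1.0 as placeholders for what TMM would estimate. It only shows the arithmetic: when CPM values are computed once on the full dataset, the WT columns are identical whichever comparison is examined, whereas rebuilding a dataset per comparison (here, with a different set of transcripts) changes the library sizes and therefore the CPM values.

```python
# Toy counts (hypothetical): rows are transcripts, columns are samples.
SAMPLES = ["WT_4", "WT_7", "WT_10", "A_1", "A_2", "B_1", "B_2"]
COUNTS = {
    "tx1": [120, 150, 100, 300, 280, 90, 110],
    "tx2": [800, 760, 820, 400, 450, 900, 870],
    "tx3": [80, 90, 80, 300, 270, 10, 20],
}
# Placeholder scale factors; in edgeR these would be estimated by TMM, once.
NORM_FACTORS = [1.0] * len(SAMPLES)
WT = ["WT_4", "WT_7", "WT_10"]


def cpm(counts, norm_factors):
    """Counts per million, with effective library size = column total * factor."""
    n = len(norm_factors)
    lib = [sum(row[j] for row in counts.values()) for j in range(n)]
    return {
        tx: [1e6 * row[j] / (lib[j] * norm_factors[j]) for j in range(n)]
        for tx, row in counts.items()
    }


def columns(table, wanted):
    """Select the named sample columns from a per-transcript table."""
    idx = [SAMPLES.index(s) for s in wanted]
    return {tx: [row[j] for j in idx] for tx, row in table.items()}


# Normalize once on all samples, then pick the columns for each comparison.
all_cpm = cpm(COUNTS, NORM_FACTORS)
cpm_a = columns(all_cpm, WT + ["A_1", "A_2"])  # comparison A: A vs WT
cpm_b = columns(all_cpm, WT + ["B_1", "B_2"])  # comparison B: B vs WT
wt_in_a = {tx: row[:3] for tx, row in cpm_a.items()}
wt_in_b = {tx: row[:3] for tx, row in cpm_b.items()}
print(wt_in_a == wt_in_b)

# Separate dataset built for one comparison with a hypothetical transcript
# filter: the library sizes change, so the WT CPM values change too.
subset_a = {tx: COUNTS[tx] for tx in ("tx1", "tx2")}
wt_separate = columns(cpm(subset_a, NORM_FACTORS), WT)
print(wt_separate["tx1"][0] == wt_in_a["tx1"][0])
```

The second check is only one way in which per-comparison datasets could give different CPM values for the same samples; it is not a diagnosis of the tables quoted below.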

Best wishes
Gordon

On Sun, 1 Dec 2013, Genomnia - Guffanti Alessandro wrote:

> Hi - OK, thanks for the feedback, I will then look carefully at the
> procedure.
>
> No, I did not have only WTA, but in the two comparisons the experiment
> samples were different - i.e. these are the same W.T. samples compared with
> two different sets of experiment samples, which I did not copy in the output.
>
> Thanks again and keep in touch
>
> Alessandro
>
>
>
> -----------------------------------------------------
> Alessandro Guffanti - Head, Bioinformatics
> Genomnia srl
> Via Nerviano, 31/B – 20020 Lainate (MI)
> Tel. +39-0293305.702 / Fax +39-0293305.777
> www.genomnia.com
> alessandro.guffanti at genomnia.com
>
> -----Original Message-----
> From: Gordon K Smyth <smyth at wehi.EDU.AU>
> To: alessandro.guffanti at genomnia.com
> Cc: Bioconductor mailing list <bioconductor at r-project.org>, Mark Robinson
> <mark.robinson at imls.uzh.ch>
> Date: Sun, 1 Dec 2013 13:25:26 +1100 (AUS Eastern Daylight Time)
> Subject: edgeR on ncRNA analysis question
>
>
> It does look like you may have done something wrong.  In fact, the output
> doesn't make sense to me.  The CPM and average logCPM values output by
> edgeR should be unchanged regardless of the comparison you are testing, so
> the two output tables you give cannot be from the same data.  And you seem
> to have wildtype samples only??
>
> Normalization of ncRNA reads is very challenging, but there seems a much
> more basic problem here.
>
> In the absence of any code leading to the output given, it is impossible
> to say more.
>
> Best wishes
> Gordon
>
>> Date: Fri, 29 Nov 2013 12:06:40 +0100
>> From: alessandro.guffanti at genomnia.com
>> To: Bioconductor mailing list <bioconductor at r-project.org>
>> Cc: bioinfo at genomnia.com
>> Subject: [BioC] edgeR on ncRNA analysis question
>>
>> Dear BioC edgeR developers and users:
>>
>> I am using edgeR for ncRNA transcriptome data analysis, i.e. mapping RNA-seq
>> results only against a ncRNA transcript database (bowtie from Color Space
>> reads).
>>
>> There seems to be, unsurprisingly, a high variability in these samples,
>> which obviously affects the FDR.
>>
>> However, what surprised us is that the CPM values for the same samples in
>> different comparisons (TMM-normalized) are always very different.
>>
>> As an example:
>>
>> Comparison A
>>
>> Transcript_ID    logFC  logCPM  PValue   FDR      WT_4_CPM  WT_7.CPM  WT_10.CPM
>> ENST00000456355  1.42   10.91   0.00001  0.03283  2843      2926      2631
>>
>> Comparison B
>>
>> Transcript_ID    logFC  logCPM  PValue   FDR      WT_4_CPM  WT_7.CPM  WT_10.CPM
>> ENST00000456355  0.91   11.11   0.00003  0.00361  190       341       157
>>
>>
>> Can TMM normalization affect the CPM values of the same samples so
>> heavily in different comparisons, or is something else wrong here?
>>
>> Thanks in advance for any feedback on this,
>>
>> Alessandro G
>>