[BioC] uneven counts for edgeR

Gordon K Smyth smyth at wehi.EDU.AU
Thu Oct 27 01:45:06 CEST 2011


Dear Lana,

It sounds strange, but it would be unwise for me to comment without 
knowing what they mean.  It is of course technically impossible for RPKM 
to be negative binomial or Poisson because RPKM values are not integers.
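
To illustrate the point about integers, here is a minimal sketch in plain Python (independent of edgeR). It uses the usual RPKM definition: reads per kilobase of transcript per million mapped reads. The read count and gene length are hypothetical values chosen only for illustration; the library size is one of those quoted further down in this thread.

```python
def rpkm(count, gene_length_bp, library_size):
    """Reads per kilobase of transcript per million mapped reads."""
    return count * 1e9 / (gene_length_bp * library_size)

# Hypothetical gene: 150 reads on a 2,300 bp transcript.
# 528,428 is one of the library sizes reported in the thread.
value = rpkm(150, 2300, 528428)

print(value)               # a fractional number, not a count
print(value.is_integer())  # False
```

Poisson and negative binomial distributions put all their probability on non-negative integers, so a value like this cannot be a draw from either.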

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
smyth at wehi.edu.au
http://www.wehi.edu.au
http://www.statsci.org/smyth

On Wed, 26 Oct 2011, Lana Schaffer wrote:

> Gordon,
> An unnamed company is claiming that the RPKM counts and/or
> some transformation of the RPKM counts is 90% normal, 5% NB,
> and 5% Poisson distributed, using the Akaike Information Criterion.
> Can you explain why this is or is not plausible?
>
> Lana
>
> -----Original Message-----
> From: Gordon K Smyth [mailto:smyth at wehi.EDU.AU]
> Sent: Saturday, October 22, 2011 5:42 PM
> To: Lana Schaffer
> Cc: Bioconductor mailing list
> Subject: uneven counts for edgeR
>
> Dear Lana,
>
> edgeR has no difficulty with uneven library sizes, and will adjust for
> this automatically during the analysis.  There is no need for you
> to do anything other than follow a standard analysis pipeline.
>
> You do not need to standardize the 4th sample by dividing the counts
> by 4.  In fact you must not do this, since it changes the
> mean-variance relationship for your data and invalidates the subsequent
> analysis.  You need to input the true read counts into edgeR.
>
> Best wishes
> Gordon
>
>
>> Date: Fri, 21 Oct 2011 15:27:25 -0700
>> From: Lana Schaffer <schaffer at scripps.edu>
>> To: "'bioconductor at r-project.org'" <bioconductor at r-project.org>
>> Subject: [BioC] uneven counts for edgeR
>>
>> Hi,
>> I have replicate sample counts for 2 groups, but one sample has 4x the
>> number of mapped reads of the other samples:
>> 528,428
>> 625,889
>> 498,569
>> 2,328,333
>>
>> I divided all the mapped transcript reads by 4 and then did the
>> normalization and analysis with edgeR. What do you recommend doing with
>> the 4th sample counts?
>>
>> Lana Schaffer
>> Biostatistics, Informatics
>> DNA Array Core Facility
>> 858-784-2263
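
The advice above against dividing the counts by 4 can be checked numerically. This is a minimal sketch in plain Python, not edgeR code. It assumes a pure Poisson model for a single gene, which is a simplification (edgeR itself fits negative binomial models), and the expected count of 40 is a hypothetical value. The helper `poisson_moments` is introduced here only for illustration.

```python
import math

def poisson_moments(mu, scale=1.0, k_max=400):
    """Mean and variance of scale * X for X ~ Poisson(mu), by direct summation."""
    ks = range(k_max + 1)
    # Poisson probabilities computed on the log scale to avoid overflow
    probs = [math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1)) for k in ks]
    mean = sum(p * scale * k for p, k in zip(probs, ks))
    var = sum(p * (scale * k - mean) ** 2 for p, k in zip(probs, ks))
    return mean, var

mu = 40.0  # hypothetical expected count for one gene

raw_mean, raw_var = poisson_moments(mu)
scaled_mean, scaled_var = poisson_moments(mu, scale=0.25)

print(raw_var / raw_mean)        # close to 1: variance equals mean for true counts
print(scaled_var / scaled_mean)  # close to 0.25: the relationship is broken
```

Dividing by 4 divides the mean by 4 but the variance by 16, so the scaled values no longer satisfy the variance-equals-mean property that a count model expects.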
