[BioC] edgeR: calcNormFactors question

Mark Robinson mark.robinson at imls.uzh.ch
Fri Jun 22 11:50:29 CEST 2012


Hi Gowthaman,

You shouldn't manually specify the offset in glmFit() unless you have a specific need to.  Short answer: use

fit <- glmFit(d, design)


>>>> Lib Fe+.1 has only 4 million reads while the others have 9 million+. But
>>>> the norm.factors are still not much different. With my naive
>>>> understanding I expect Fe+.1 to be very different from the others. I would
>>>> like to know if what I see is okay?

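This is OK. The norm.factors only correct for differences in composition between libraries, so they are not expected to reflect sequencing depth. Depth is still accounted for, since the offset used in the downstream modeling is based on the product of the lib.size and norm.factors columns (the effective library size, taken on the log scale).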
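
As a rough illustration (a sketch in base R, not edgeR's internal code), here is that product computed from the numbers in the oldsetDGE$samples table quoted further down in this thread. The variable names eff.lib.size and offset are my own, and taking the natural log of the effective library size is how I am assuming the offset is formed:

```r
# Values copied from the oldsetDGE$samples table quoted in this thread.
samples <- data.frame(
  lib.size     = c(9664343, 11248827, 4194124, 9963626),
  norm.factors = c(0.9865411, 1.0812947, 0.9662389, 0.9701888),
  row.names    = c("fe-.1", "fe-.2", "fe+.1", "fe+.2")
)

# Effective library size: raw depth scaled by the normalization factor.
eff.lib.size <- samples$lib.size * samples$norm.factors
names(eff.lib.size) <- rownames(samples)

# Log-scale offset (assumption: natural log of the effective library size).
offset <- log(eff.lib.size)

print(round(eff.lib.size))
print(offset)
```

The effective library size for fe+.1 stays well below the others even though its norm.factor is close to 1, so the lower depth of that library is carried into the model through lib.size rather than through norm.factors.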

Best,
Mark

----------
Prof. Dr. Mark Robinson
Bioinformatics
Institute of Molecular Life Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland

v: +41 44 635 4848
f: +41 44 635 6898
e: mark.robinson at imls.uzh.ch
o: Y11-J-16
w: http://tiny.cc/mrobin

----------
http://www.fgcz.ch/Bioconductor2012

On 22.06.2012, at 11:31, gowtham wrote:

> Hi Belinda,
> I think I am a bit confused now. The help document suggests I should use
> only one of "offset" and "lib.size". It seems both of them take the
> library size into account, and it sounds like "offset" takes precedence when
> both are supplied.
> 
> So, my question is do I have to explicitly ask for one or other? And do I
> have to explicitly give it a value?
> 
> 
> fit <- glmFit(d, design)
> 
> OR
> 
> 
> fit <- glmFit(d, design, offset=NULL)
> 
> OR
> 
> fit <- glmFit(d, design, lib.size=c(9664343, 11248827, 4194124, 9963626))
> 
> Should I supply values for "lib.size"? Note that my DGEList already has
> library size information in it.
> 
> 
> Once again, thanks for your answer and the pointer to glmFit.
> Gowthaman
> 
> On Fri, Jun 22, 2012 at 2:18 AM, gowtham <ragowthaman at gmail.com> wrote:
> 
>> Thanks very much Belinda. That is comforting.
>> 
>> My DGEList object has library sizes added to it. Do I still need to supply
>> a numeric vector with library sizes while fitting the glm? Or is it
>> automatically pulled from the DGEList object?
>> 
>> Reading the help, I understand it's automatic. Please advise me if I am wrong.
>> " If y is a DGEList object then the default for lib.size is the product
>> of the library sizes and the normalization factors (in the samples slot
>> of the object). "
>> 
>> Thanks,
>> Gowthaman
>> 
>> 
>> 
>> 
>> On Thu, Jun 21, 2012 at 4:58 PM, Belinda Phipson <phipson at wehi.edu.au> wrote:
>> 
>>> Hi Gowthaman
>>> 
>>> Your output looks fine. What is more important is that library size is
>>> taken into account as an offset later on when you fit the glm. See
>>> help(glmFit).
>>> 
>>> Cheers,
>>> Belinda
>>> 
>>> -----Original Message-----
>>> From: bioconductor-bounces at r-project.org [mailto:
>>> bioconductor-bounces at r-project.org] On Behalf Of gowtham
>>> Sent: Friday, 22 June 2012 9:40 AM
>>> To: bioconductor
>>> Subject: Re: [BioC] edgeR: calcNormFactors question
>>> 
>>> Sorry about the repeated mailing: I have attached a smear plot of the data
>>> in case that helps anyone attempting to answer my question.
>>> 
>>> 
>>> On Thu, Jun 21, 2012 at 4:07 PM, gowtham <ragowthaman at gmail.com> wrote:
>>> 
>>>> Hi Everyone,
>>>> I am analyzing an RNA-seq experiment with two groups, each having two
>>>> replicates. One of the 4 libraries has only about half as many reads
>>>> mapping to the genome.
>>>> 
>>>> Lib Fe+.1 has only 4 million reads while the others have 9 million+. But
>>>> the norm.factors are still not much different. With my naive
>>>> understanding I expect Fe+.1 to be very different from the others. I would
>>>> like to know if what I see is okay?
>>>> 
>>>> > oldsetDGE <- calcNormFactors(oldsetDGE)
>>>> > oldsetDGE$samples
>>>>      group lib.size norm.factors
>>>> fe-.1     2  9664343    0.9865411
>>>> fe-.2     2 11248827    1.0812947
>>>> fe+.1     1  4194124    0.9662389
>>>> fe+.2     1  9963626    0.9701888
>>>> 
>>>> 
>>>> Thanks very much,
>>>> Gowthaman
>>>> --
>>>> Gowthaman
>>>> 
>>>> Bioinformatics Systems Programmer.
>>>> SBRI, 307 West lake Ave N Suite 500
>>>> Seattle, WA. 98109-5219
>>>> Phone : LAB 206-256-7188 (direct).
>>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
> 