[BioC] edgeR: calcNormFactors question
Mark Robinson
mark.robinson at imls.uzh.ch
Fri Jun 22 11:50:29 CEST 2012
Hi Gowthaman,
You shouldn't manually specify the offset in glmFit() unless you have a specific need to. Short answer: use
fit <- glmFit(d, design)
>>>> Lib Fe+.1 has only 4 million reads while the others have 9 million or more. But
>>>> the norm.factors are still not much different. With my naive
>>>> understanding I expected Fe+.1 to be very different from the others. I would
>>>> like to know if what I see is okay?
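This is OK: the offset used in the downstream modeling is based on the product of the lib.size and norm.factors columns (the effective library size, taken on the log scale). The lower depth of Fe+.1 is therefore accounted for through lib.size, while norm.factors only adjusts for differences in composition between libraries, so it is expected to stay close to 1.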
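As a rough illustration of that product, here is a minimal base-R sketch using the lib.size and norm.factors values from the oldsetDGE$samples table quoted further down. It does not call edgeR; the variable names (eff.lib.size, offset, rel.depth) are made up for this example, and the log-scale offset is an assumption about what glmFit() derives by default from a DGEList rather than a copy of its internals.

```r
# Values copied from the oldsetDGE$samples table quoted later in this thread.
lib.size <- c("fe-.1" = 9664343, "fe-.2" = 11248827,
              "fe+.1" = 4194124, "fe+.2" = 9963626)
norm.factors <- c(0.9865411, 1.0812947, 0.9662389, 0.9701888)

# Effective library size: raw sequencing depth scaled by the normalization factor.
eff.lib.size <- lib.size * norm.factors

# Log-scale offset (assumed to mirror the default offset used in the GLM fit).
offset <- log(eff.lib.size)

# Depth of each library relative to the deepest one, for comparison.
rel.depth <- eff.lib.size / max(eff.lib.size)

print(round(eff.lib.size))
print(round(rel.depth, 2))
```

The point of the sketch is that fe+.1 still ends up with by far the smallest effective library size, even though its norm.factors entry is close to 1.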
Best,
Mark
----------
Prof. Dr. Mark Robinson
Bioinformatics
Institute of Molecular Life Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland
v: +41 44 635 4848
f: +41 44 635 6898
e: mark.robinson at imls.uzh.ch
o: Y11-J-16
w: http://tiny.cc/mrobin
----------
http://www.fgcz.ch/Bioconductor2012
On 22.06.2012, at 11:31, gowtham wrote:
> Hi Belinda,
> I think I am a bit confused now. The help document suggests I should use
> only one of "offset" and "lib.size". It seems both of them take the
> library size into account, and it sounds like "offset" takes precedence when
> both are supplied.
>
> So, my question is: do I have to explicitly ask for one or the other? And do I
> have to explicitly give it a value?
>
>
> fit <- glmFit(d, design)
>
> OR
>
>
> fit <- glmFit(d, design, offset=NULL)
>
> OR
>
> fit <- glmFit(d, design, lib.size=c(9664343, 11248827, 4194124, 9963626))
>
> Should I supply values for "lib.size"? Note that my DGEList already has
> library size information in it.
>
>
> Once again thanks for your answer and pointer to glmFit.
> Gowthaman
>
> On Fri, Jun 22, 2012 at 2:18 AM, gowtham <ragowthaman at gmail.com> wrote:
>
>> Thanks very much Belinda. That is comforting.
>>
>> My DGEList object has library sizes added to it. Do I still need to supply
>> a numeric vector with library sizes while fitting the glm? Or is it
>> automatically pulled from the DGEList object?
>>
>> Reading the help, I understand it is automatic. Please advise me if I am wrong.
>> " If y is a DGEList object then the default for lib.size is the product
>> of the library sizes and the normalization factors (in the samples slot
>> of the object). "
>>
>> Thanks,
>> Gowthaman
>>
>>
>>
>>
>> On Thu, Jun 21, 2012 at 4:58 PM, Belinda Phipson <phipson at wehi.edu.au> wrote:
>>
>>> Hi Gowthaman
>>>
>>> Your output looks fine. What is more important is that library size is
>>> taken into account as an offset later on when you fit the glm. See
>>> help(glmFit).
>>>
>>> Cheers,
>>> Belinda
>>>
>>> -----Original Message-----
>>> From: bioconductor-bounces at r-project.org [mailto:
>>> bioconductor-bounces at r-project.org] On Behalf Of gowtham
>>> Sent: Friday, 22 June 2012 9:40 AM
>>> To: bioconductor
>>> Subject: Re: [BioC] edgeR: calcNormFactors question
>>>
>>> Sorry about the repeated mailing: I have attached a smear plot of the data
>>> in case that helps anyone attempting to answer my question.
>>>
>>>
>>> On Thu, Jun 21, 2012 at 4:07 PM, gowtham <ragowthaman at gmail.com> wrote:
>>>
>>>> Hi Everyone,
>>>> I am analyzing an RNA-seq experiment with two groups, each having two
>>>> replicates. One of the 4 libraries has only about half as many reads
>>>> mapping to the genome.
>>>>
>>>> Lib Fe+.1 has only 4 million reads while the others have 9 million or more. But
>>>> the norm.factors are still not much different. With my naive
>>>> understanding I expected Fe+.1 to be very different from the others. I would
>>>> like to know if what I see is okay?
>>>>
>>>> > oldsetDGE <- calcNormFactors(oldsetDGE)
>>>> > oldsetDGE$samples
>>>>       group lib.size norm.factors
>>>> fe-.1     2  9664343    0.9865411
>>>> fe-.2     2 11248827    1.0812947
>>>> fe+.1     1  4194124    0.9662389
>>>> fe+.2     1  9963626    0.9701888
>>>>
>>>>
>>>> Thanks very much,
>>>> Gowthaman
>>>> --
>>>> Gowthaman
>>>>
>>>> Bioinformatics Systems Programmer.
>>>> SBRI, 307 West lake Ave N Suite 500
>>>> Seattle, WA. 98109-5219
>>>> Phone : LAB 206-256-7188 (direct).
>>>>
>>
>