[BioC] limma and Across Array Normalisation

Wed Feb 12 00:20:07 CET 2014

Although not documented, you can actually shorten the read command further 
to:

   x <- read.maimages(targets,source="genepix",green.only=TRUE)

The function will automatically look for a column called "FileName" in 
targets.
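
As a minimal sketch of what such a targets frame might look like: the 
file names and group labels below are invented for illustration, and the 
read is attempted only if limma is installed and those files exist.

```r
# Hypothetical targets frame; file names and group labels are invented.
targets <- data.frame(
  FileName = c("normal1.gpr", "disease1.gpr"),
  Group    = c("Normal", "Disease"),
  stringsAsFactors = FALSE
)

# read.maimages() is given the whole data frame and picks up the
# "FileName" column itself. Guarded so the sketch runs without the files.
if (requireNamespace("limma", quietly = TRUE) &&
    all(file.exists(targets$FileName))) {
  x <- limma::read.maimages(targets, source = "genepix", green.only = TRUE)
  limma::plotMA(x)
}
```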

Gordon

On Wed, 12 Feb 2014, Gordon K Smyth wrote:

> On Tue, 11 Feb 2014, Saket Choudhary wrote:
>
>> On 11-Feb-2014, at 10:52 PM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>> 
>>> On Tue, 11 Feb 2014, Saket Choudhary wrote:
>>> 
>>>> On 11-Feb-2014, at 10:31 PM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>> 
>>>>> Yes, obviously there'll be a baseline shift when you subtract 
>>>>> background, then add an offset and log transform.
>>>>> 
>>>>> Your plots do not appear to be valid MA plots.
>>>> 
>>>> Could you please point out the error?
>>>> I understand a baseline shift is expected, but I can't figure out what
>>>> is going wrong otherwise.
>>> 
>>> Well, you manually create an MAList object from your single channel data, 
>>> even though an MAList is strictly for two colour data.
>>> 
>>> If you deceive limma as to the true nature of your data, it's not 
>>> surprising that the resulting plot might not be correct.
>>> 
>>> I am not clear why you need to make so many variations on the standard 
>>> limma single channel analysis pipeline.
>>> 
>> 
>> Is there any other way to visualise MA plots for single channel data?
>
> plotMA() already works directly on any data object:
>
>  x <- read.maimages(targets$FileName,source="genepix",green.only=TRUE)
>  plotMA(x)
>
> What could be easier than that?
>
> Gordon
>
>> 
>>> Gordon
>>> 
>>> 
>>>> 
>>>> Thanks,
>>>> Saket
>>>> 
>>>> 
>>>>> Gordon
>>>>> 
>>>>> On Tue, 11 Feb 2014, Saket Choudhary wrote:
>>>>> 
>>>>>> Hello Gordon,
>>>>>> 
>>>>>> Is there a reason to believe the MA plots should inherently be
>>>>>> baseline shifted after normalisation?
>>>>>> 
>>>>>> Raw MA: https://db.tt/kDBod1EJ
>>>>>> background correction with 'nec': https://db.tt/0vVWeD21
>>>>>> background correction with nec followed by normalisation: 
>>>>>> https://db.tt/f0M0rWeg
>>>>>> background correction with 'normexp': https://db.tt/OJO0zea5
>>>>>> background correction with normexp followed by normalisation:
>>>>>> https://db.tt/rbLJmFBE
>>>>>> 
>>>>>> 
>>>>>> The files are a bit heavy so might take some time to load into any pdf 
>>>>>> reader.
>>>>>> 
>>>>>> Code: https://gist.github.com/saketkc/8931951
>>>>>> 
>>>>>> Saket
>>>>>> 
>>>>>> On 9 February 2014 20:45, Saket Choudhary <saketkc at gmail.com> wrote:
>>>>>>> Related question: Similar to your case, my final topTable() output
>>>>>>> indicates some genes having a negative logFC, though the literature
>>>>>>> expects them to have a positive logFC.
>>>>>>> 
>>>>>>> I looked up the calculations and the transition from positive to
>>>>>>> negative logFC for these genes seems to happen after the
>>>>>>> normalizeBetweenArrays step (irrespective of the kind of normalisation
>>>>>>> I choose).
>>>>>>> 
>>>>>>> This is a naive question again, but I am trying to understand what
>>>>>>> should be a good metric to decide which method tends to give the fewest
>>>>>>> false positives like this, given that I have limited knowledge of which
>>>>>>> genes should be up- or down-regulated (unlike in your case, where you
>>>>>>> knew the kind of regulation [up/down] expected).
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Saket
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On 9 February 2014 04:00, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>>>>>> 
>>>>>>>> On Sat, 8 Feb 2014, Saket Choudhary wrote:
>>>>>>>> 
>>>>>>>>> Hello Gordon,
>>>>>>>>> 
>>>>>>>>> I had a chance to go through the paper. I have a set of negative and
>>>>>>>>> positive controls, arising out of single channel Genepix platform.
>>>>>>>>> From what I could gather, the 'nec' method in limma performs
>>>>>>>>> background correction using these negative control spots.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Yes, but the negative controls are assumed to behave exactly like
>>>>>>>> probes for unexpressed genes.  This is true for Illumina Beadchips,
>>>>>>>> but is often not the case for other platforms.  If not, then you
>>>>>>>> would be better to stick with normexp as you are already using.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> However one of the inputs to 'nec' is also "detection.p", which the
>>>>>>>>> .gprs don't have.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> detection.p is not a required argument.  It is used only when
>>>>>>>> negative controls are not available.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> I could simply take the mean of all the negative controls' E and Eb,
>>>>>>>>> and subtract it from each probe's E and Eb, doing it for all the
>>>>>>>>> arrays.  Would this mimic what I want to achieve with the 'nec'
>>>>>>>>> function?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> No, that naive approach is not equivalent and typically performs 
>>>>>>>> poorly.
>>>>>>>> 
>>>>>>>> Gordon
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> Saket
>>>>>>>>> 
>>>>>>>>> On 6 February 2014 13:04, Saket Choudhary <saketkc at gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hello Gordon,
>>>>>>>>>> 
>>>>>>>>>> Unfortunately I do not have access to this as of now.  I will,
>>>>>>>>>> however, get hold of it soon.
>>>>>>>>>> 
>>>>>>>>>> After implementing this, I would expect the 'CONTROL' points to have
>>>>>>>>>> similar, if not the same, values, right?
>>>>>>>>>> 
>>>>>>>>>> However, some of the values for these control genes have high
>>>>>>>>>> variance after the normalizeBetweenArrays step.  Is this behaviour
>>>>>>>>>> normal or am I missing something?
>>>>>>>>>> 
>>>>>>>>>> Saket
>>>>>>>>>> 
>>>>>>>>>> On 6 February 2014 06:32, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> If 'x' is your background-corrected EList, then
>>>>>>>>>>> 
>>>>>>>>>>> w <- rep(1,nrow(x))
>>>>>>>>>>> w[controls] <- 100
>>>>>>>>>>> y <- normalizeBetweenArrays(x, method="cyclicloess", weights=w)
>>>>>>>>>>> 
>>>>>>>>>>> does what you want.
>>>>>>>>>>> 
>>>>>>>>>>> For an example of this approach:
>>>>>>>>>>> 
>>>>>>>>>>> http://rnajournal.cshlp.org/content/19/7/876
>>>>>>>>>>> 
>>>>>>>>>>> Best wishes
>>>>>>>>>>> Gordon
>>>>>>>>>>> 
>>>>>>>>>>> --------- original message ----------
>>>>>>>>>>> Saket Choudhary saketkc at gmail.com
>>>>>>>>>>> Thu Feb 6 06:59:42 CET 2014
>>>>>>>>>>> 
>>>>>>>>>>> I am analysing a proteomics microarray data set for a two-group
>>>>>>>>>>> sample (Normal and Disease) using a single colour channel.  The
>>>>>>>>>>> arrays have a set of pre-defined CONTROL points whose expression
>>>>>>>>>>> levels are supposed to be similar/same across all the arrays.
>>>>>>>>>>> 
>>>>>>>>>>> I would like to 'normalise' the levels of all probes such that
>>>>>>>>>>> normalisation ends up with all CONTROL points having similar
>>>>>>>>>>> expression levels.  If I understand it right, normalizeBetweenArrays
>>>>>>>>>>> does not allow this kind of normalisation.
>>>>>>>>>>> 
>>>>>>>>>>> Is there a pre-implemented function to do this?  If not, what would
>>>>>>>>>>> be a way to achieve this kind of normalisation?
>>>>>>>>>>> 
>>>>>>>>>>> Code: https://gist.github.com/saketkc/8669586
