[BioC] Limma, blocking, robust methods, and paired samples question
Gordon K Smyth
smyth at wehi.EDU.AU
Thu Nov 28 04:26:40 CET 2013
Also see ?arrayWeights. This is appropriate for dealing with outlier
samples.
In terms of specifically robust methods, limma has three approaches:
1. lmFit(...,method="robust") deals with outliers at the observation
level.
2. arrayWeights deal with outliers at the sample level.
3. eBayes(robust=TRUE) deals with outliers at the gene level.
In our own publications, we have used (2) most, then (3) and (1) only
occasionally.
Gordon
On Thu, 28 Nov 2013, Gordon K Smyth wrote:
> Dear Gustavo,
>
> Your email raises a lot of different issues, so I will try to answer a few
> things in turn.
>
> First, you appear to have paired samples, so you should analyse the data as a
> paired design. There is no further information to be extracted by using
> random effects for a paired design.
>
> You have tried to combine a random effect with method="robust" but limma does
> not support this. If you use method="robust", then the correlation will be
> ignored and treated as zero.
>
> You say that the consensus correlation is 1, but your output actually shows
> it to be 0.72, and as mentioned above it has been treated as zero in the
> analysis.
>
> I suggest that you stick to the usual paired analysis. As an alternative to
> lmFit(...,method="robust"), you could also try
> eBayes(robust=TRUE,trend=TRUE). I also suggest that you make plots of your
> data to see what is going on rather than just trying out different analyses
> to see what they do. For example
>
> plotSA(fit)
>
> is likely to be useful.
>
> Best wishes
> Gordon
>
>
>> Date: Tue, 26 Nov 2013 18:47:53 +0100
>> From: Gustavo Fernandez Bayon <gbayon at gmail.com>
>> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
>> Subject: [BioC] Limma, blocking, robust methods, and paired samples
>> question
>>
>> Hello everybody.
>>
>> I am experiencing some problems with a methylation microarray experiment I
>> am trying to analyze and, although I am able to get results to some extent,
>> I am suspicious about them and would like to know if I am doing things
>> correctly.
>>
>> We have 12 samples. Each sample is built on DNA coming from a single cell.
>> The samples can be of two types: FLO- or FLO+. We have two samples, one
>> FLO- and one FLO+, for every type of primary tumor we are working on. So
>> the experiment could be described as follows:
>>
>> Tumor_Name Sample_Group
>> <character> <character>
>> 9296931014_R01C01 185_8-6-11 FLO-
>> 9296931014_R02C01 185_8-6-11 FLO+
>> 9296931014_R03C01 185_SCD1S FLO-
>> 9296931014_R04C01 185_SCD1S FLO+
>> 9296931014_R05C01 185_SCD2S FLO-
>> ... ... ...
>> 9296931014_R02C02 185_4/12 FLO+
>> 9296931014_R03C02 A6L_8-1-10 FLO-
>> 9296931014_R04C02 A6L_8-1-10 FLO+
>> 9296931014_R05C02 A6L_23-2-10 FLO-
>> 9296931014_R06C02 A6L_23-2-10 FLO+
>>
>> We want to find the differentially methylated probes between FLO- and FLO+.
>> If we try to fit a simple model (~ Sample_Group), we are not able to get
>> any useful result. I thought that this could be caused by the among-tumor
>> differences, so I decided to model the experiment as a paired samples test,
>> and did the following:
>>
>> design <- model.matrix(~ Sample_Group + Tumor_Name, data=pdata)
>> fit <- lmFit(mvalues, design)
>> efit <- eBayes(fit)
>> topTable(efit, coef=2)
>>
>> Unfortunately, no probe had an adjusted p-value under our significance
>> level. So I kept searching for methods to overcome this.
>>
>> Lately, we have been playing around with robust methods, because we think
>> they could suit well to microarray problems, where outliers (specially in
>> those probes with SNPs nearby) could account for much variability. So I
>> decided to do the following:
>>
>> design <- model.matrix(~ Sample_Group + Tumor_Name, data=pdata)
>> fit <- lmFit(mvalues, design, method='robust')
>> efit <- eBayes(fit)
>> topTable(efit, coef=2)
>>
>> logFC AveExpr t P.Value
>> adj.P.Val B
>> cg03143457 1.1528608 -2.521458 20.10818 3.293139e-08 0.01415556
>> 7.525073
>> ch.4.162336241R 1.1927542 -3.956192 16.62275 1.491918e-07 0.03076082
>> 6.622935
>> cg23320987 -0.7800961 3.997052 -14.76658 3.791254e-07 0.03076082
>> 6.201241
>> ...
>>
>> In this case, I am able to get more than 9000 significant probes. This
>> could suffice, but I started reading discussions in this list about limma
>> being able to model random effects, and decided to see what could happen if
>> I modeled the origin tumor as a random effect:
>>
>> design <- model.matrix(~ Sample_Group, data=pdata)
>> cor <- duplicateCorrelation(mvalues, design, block=pdata$Tumor_Name)
>>
>>> cor$consensus.correlation
>> [1] 0.7208955
>>
>> fit <- lmFit(mvalues, design, block=pdata$Tumor_Name,
>> correlation=cor$consensus.correlation, method='robust')
>>
>> There were 50 or more warnings (use warnings() to see the first 50)
>>> warnings()
>> Warning messages:
>> 1: In rlm.default(x = X, y = y, weights = w, ...) :
>> 'rlm' failed to converge in 20 steps
>> 2: In rlm.default(x = X, y = y, weights = w, ...) :
>> ...
>>
>> efit <- eBayes(fit)
>> topTable(efit, coef=2)
>>
>>> topTable(efit, coef=2)
>> logFC AveExpr t P.Value
>> adj.P.Val B
>> cg13065299 0.6832614 4.468798 6.702811 0.000023155532 0.9999998
>> -3.510208
>> cg04211927 0.7980711 3.845420 6.906811 0.000017357590 0.9999998
>> -3.525569
>> ch.22.436090R 0.5894315 -3.633235 6.305634 0.000041228561 0.9999998
>> -3.526679
>> ...
>>
>> I was expecting a different number of probes, accounting for the increased
>> power (as I have read) of a mixed model in this case, but I actually got 0
>> significant probes. Another strange thing is that consensus correlation
>> equals 1.
>>
>> Is it possible to treat this experiment as a paired-sample test? I am
>> always doubtful when thinking about experimental designs.
>>
>> Have I done the call to duplicateCorrelation() in a correct way? I am not
>> sure if I have to include the blocking variable in the design matrix or
>> not.
>>
>> Is it possible to fit a mixed model using the robust method? Maybe I am
>> trying to do too many things at once.
>>
>> Could I trust the ~9000 probes I got from the simple, paired design?
>>
>> Thank you very much. As always, any hint or help would be much appreciated.
>>
>> Regards,
>> Gustavo
>>
>> PS: My sessionInfo:
>>
>> [[alternative HTML version deleted]]
>
______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}
More information about the Bioconductor
mailing list