[BioC] limma Question: Design matrix for imperfectly paired samples
Gordon Smyth
smyth at wehi.EDU.AU
Sat May 12 09:23:13 CEST 2007
>Message: 21
>Date: Thu, 10 May 2007 14:10:58 -0400
>From: Xia Han <xia.han at bms.com>
>Subject: [BioC] limma Question: Design matrix for imperfectly paired
> samples
>To: bioconductor at stat.math.ethz.ch
>Message-ID: <46436032.5070102 at bms.com>
>Content-Type: text/plain
>
>Dear List:
>
>I'm using limma package on an affy dataset. I have pre- and post-
>treatment biopsies, and some biopsies came from different regions of
>tumors and were profiled more than once, e.g. sample 01_01_Post (so they
>should be technical replicates?). There are some samples missing pre- or
>post- counterparts. The phenoData likes this:
>
> Subject Treatment Subject.Treat Pairs
>1 01_01 Pre 01_01_Pre 1
>2 01_01 Post 01_01_Post 1
>3 01_01 Post 01_01_Post 1
>4 01_02 Pre 01_02_Pre NA
>5 02_01 Pre 02_01_Pre 2
>6 02_01 Post 02_01_Post 2
>7 02_02 Pre 02_02_Pre 3
>8 02_02 Post 02_02_Post 3
>9 02_03 Pre 02_03_Pre NA
>10 03_01 Pre 03_01_Pre 4
>11 03_01 Post 03_01_Post 4
>12 03_02 Pre 03_02_Pre 5
>13 03_02 Pre 03_02_Pre 5
>14 03_02 Post 03_02_Post 5
>15 03_03 Post 03_03_Post NA
>16 03_04 Pre 03_04_Pre 6
>17 03_04 Pre 03_04_Pre 6
>18 03_04 Post 03_04_Post 6
>19 03_05 Pre 03_05_Pre 7
>20 03_05 Post 03_05_Post 7
>
>
>The purpose is to detect treatment effects with more statistical power.
>I have 2 ways of building matrix as below:
>/==================================
>#First possibility:
>//design <- model.matrix(Treatment)
>corfit <- duplicateCorrelation(data, design=design, ndups=1, block=Subject)
>fit <- lmFit(data, design, block=Subject, cor=corfit$consensus)
>fit <- eBayes(fit)
>topTable(fit, coef=2)///
What's wrong with this? Apparently you have abandoned this approach,
but you don't say why.
>==========================================
>I was trying to use block=Subject.Treat for corfit, but got an error
>message of " Too much damping -convergence tolerance not achievable in:
>glmgam.fit(.....)". I can only run with block=Subject for corfit. Can
>somebody explain this to me?
>==========================================
Firstly, using Subject.Treat as the blocking variable is incorrect,
because Treat is the treatment of interest and must be put in the
design matrix not used as part of the blocking.
Secondly, there are errors and there are warnings. What you have is a
warning, not an error. An error is when R says "Error" and the
execution stops. A warning is when R says "Warning" and the execution
continues without interruption.
The fact that this warning message is not a reason for concern is
explained on the help page for duplicateCorrelation(). Read it! A
search through the Bioconductor mailing list archive for "Too much
damping", would also reveal a discussion telling you the same thing.
//#Second possibility: //
>phenoData.pairs <- subset(phenoData, Pairs!="NA")
>data.pairs <- data[ ,row.names(phenoData.pairs)]
>design <- model.matrix(~phenoData.pairs$Pairs + phenoData.pairs$Treatment)
>fit <- lmFit(data.pairs, design)
>fit <- eBayes(fit)
>topTable(fit, coef=3)/
>=========================================
>The above code would ignore data from non-paired samples. Does it
>consider the technical replicates for some of the samples?
Yes, it does use the technical replicate data. How could it not,
since the technical replicates are included in your dataset? From a
hypothesis testing point of view, this approach is a somewhat
over-optimistic, because you would be testing against technical variation only.
Best wishes
Gordon
>Thanks,
>Xia
More information about the Bioconductor
mailing list