[BioC] Limma: how to extract mean of groups?

Hooiveld, Guido Guido.Hooiveld at wur.nl
Mon Jul 6 23:18:27 CEST 2009


Hi James,

OK, I got it. :)

Let me, as biologist, rephrase what you said (correct me if I am wrong):
(email is also for the archive, since I do know some aspects of the
issue I had are also faced by fellow biologists, who, like me, are less
trained in statistics).
    

- key words are 'group-means parameterization'.
- my current design is: design <- model.matrix(~0+TS). The term '~0'
indictates that no intercept is calculated. [Thus I was on the correct
track... ;)]
- I defined these two contrasts: cont.matrix <-
makeContrasts(WTwyvWTc=WT.WY-WT.Con, KOwyvKOc=KO.WY-KO.Con,
levels=design). Doing so, for each of the two strains, I only tested for
the difference between treatment (WY) and control. The coefficients
reflect the difference between the treatment and control, which equals
the log2 of the FC. This is what I had seen before:
 
> fit2
An object of class "MArrayLM"
$coefficients
              Contrasts
                   WTwyvWTc     KOwyvKOc
  100009600_at   -0.02321959    0.1853882
  100012_at      -0.11743549   -0.2804115
  100017_at       0.41806549    0.2506005
  100019_at       0.44521455    0.1060852
  100034251_at   -0.39616238    0.1285417
11512 more rows ...


- After your reply I redefined my contrast to explicitly state the first
2 experimental groups:  cont.matrix <- makeContrasts(WTWY=WT.WY,
WTCon=WT.Con, WTwyvWTc=WT.WY-WT.Con, KOwyvKOc=KO.WY-KO.Con,
levels=design). Since no intercept has been defined, I now also test
whether the expression of a gene in the WT-WY and WT-control groups is
significanty different compared to....? Compared to what I don't know
(overall mean, or zero??), but in my case this is not relevant. In the
same run I also test for the difference between treatment and control,
like I did before. The nice thing now is that the first two coeffecient
represent the average expression of the WT.WY and WT.Con groups, the
last two coefficients (again) the log2 of the FC between treatment vs
control.

> fit2
An object of class "MArrayLM"
$coefficients
              Contrasts
                  WT.WY  WT.Con      WTwyvWTc     KOwyvKOc
  100009600_at 2.418141 2.486590   -0.02321959    0.1853882
  100012_at    1.992032 2.058794   -0.11743549   -0.2804115
  100017_at    6.804809 6.888234    0.41806549    0.2506005
  100019_at    3.028478 3.137888    0.44521455    0.1060852
  100034251_at 4.182340 6.298939   -0.39616238    0.1285417
11512 more rows ...
     
- However, regarding the standard deviation I am worried about all
identical values that are observed for 5 different probesets within a
same group. I have the feeling the stdev.unscaled are actually not the
'real' SDs, but that I have to further manipulate the unscaled stdevs.
Thus, are the identical results below to be expected? If not, how to get
the standard deviation instead?

$stdev.unscaled
              Contrasts
                  WT.WY    WT.Con      WTwyvWTc     KOwyvKOc
  100009600_at 0.5491619 0.478989     0.8439029    0.7600834
  100012_at    0.5491619 0.478989     0.8439029    0.7600834
  100017_at    0.5491619 0.478989     0.8439029    0.7600834
  100019_at    0.5491619 0.478989     0.8439029    0.7600834
  100034251_at 0.5491619 0.478989     0.8439029    0.7600834
11512 more rows ... 
 

Thanks & regards,
Guido

> -----Original Message-----
> From: James W. MacDonald [mailto:jmacdon at med.umich.edu] 
> Sent: 06 July 2009 21:45
> To: Hooiveld, Guido
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Limma: how to extract mean of groups?
> 
> Hi Guido,
> 
> Hooiveld, Guido wrote:
> > Hi James,
> > 
> > Thanks. I indeed had seen that before, but I had the impression the 
> > 'coefficients' actually are the log2 of the fold change...?!?
> > Anywayz, I'll have a closer look at it to make sure I understand it 
> > correctly.
> 
> It depends on how you set up the design matrix. If you have 
> an intercept, then the coefficients will be the difference 
> between a baseline level and whatever level you are looking 
> at. However, if you don't have an intercept, then the 
> coefficients represent the group means.
> 
> Best,
> 
> Jim
> 
> 
> > 
> > Best,
> > Guido
> > 
> > 
> >  
> > 
> >> -----Original Message-----
> >> From: James W. MacDonald [mailto:jmacdon at med.umich.edu]
> >> Sent: 06 July 2009 18:50
> >> To: Hooiveld, Guido
> >> Cc: bioconductor at stat.math.ethz.ch
> >> Subject: Re: [BioC] Limma: how to extract mean of groups?
> >>
> >> Hi Guido,
> >>
> >> Hooiveld, Guido wrote:
> >>> Dear list,
> >>>  
> >>> Is it possible to extract the group means + SD from 
> Limma's output?
> >>>  
> >>> I do know that the "Amean" can be extracted (fit2$Amean),
> >> but this is
> >>> the mean across all arrays/groups, and i am also 
> interested in the 
> >>> means of the experimental groups.
> >> Have you read the relevant help page (hint ?MArrayLM-class)?
> >>
> >> I get
> >>
> >> Slots/Components:
> >>
> >>       'MArrayLM' objects do not contain any slots (apart 
> from '.Data')
> >>       but they should contain the following list components:
> >>
> >>       'coefficients': 'matrix' containing fitted coefficients or
> >>            contrasts
> >>
> >>       'stdev.unscaled': 'matrix' containing unscaled standard 
> >> deviations
> >>            of the coefficients or contrasts
> >>
> >> Which appears to be what you want.
> >>
> >> Best,
> >>
> >> Jim
> >>
> >>
> >>> Therefore, is it somehow possible to extact the mean (+ 
> SD) of the 
> >>> groups that are defined by the contrast matrix? Thus in my
> >> particular
> >>> case the mean + SD of the WT.Con, WT.WY. KO.Con and KO.WY groups.
> >>>  
> >>> Thanks,
> >>> Guido
> >>>  
> >>>  
> >>> library(affy)
> >>> library(limma)
> >>>  
> >>> targets <- readTargets("targets.txt") data <- 
> >>> ReadAffy(filenames=targets$FileName)
> >>> x.norm <- rma(data)
> >>>
> >>> TS <- paste(targets$Strain, targets$Treatment, sep=".") TS <- 
> >>> factor(TS, levels=c("WT.Con","WT.WY","KO.Con","KO.WY"))
> >>> design <- model.matrix(~0+TS)
> >>> colnames(design) <- levels(TS)
> >>> fit <- lmFit(eset, design)
> >>> cont.matrix <- makeContrasts(WTwyvWTc=WT.WY-WT.Con,
> >>> KOwyvKOc=KO.WY-KO.Con, levels=design)
> >>> fit2 <- contrasts.fit(fit, cont.matrix)
> >>> fit2 <- eBayes(fit2)
> >>>
> >>> ------------------------------------------------
> >>> Guido Hooiveld, PhD
> >>> Nutrition, Metabolism & Genomics Group Division of Human 
> Nutrition 
> >>> Wageningen University Biotechnion, Bomenweg 2
> >>> NL-6703 HD Wageningen
> >>> the Netherlands
> >>> tel: (+)31 317 485788
> >>> fax: (+)31 317 483342 
> >>> internet:   http://nutrigene.4t.com <http://nutrigene.4t.com/>  
> >>> email:      guido.hooiveld at wur.nl 
> >>>
> >>>
> >>>
> >>> 	[[alternative HTML version deleted]]
> >>>
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>> Search the archives: 
> >>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >> --
> >> James W. MacDonald, M.S.
> >> Biostatistician
> >> Douglas Lab
> >> University of Michigan
> >> Department of Human Genetics
> >> 5912 Buhl
> >> 1241 E. Catherine St.
> >> Ann Arbor MI 48109-5618
> >> 734-615-7826
> >>
> >>
> > 
> 
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
> 
> 



More information about the Bioconductor mailing list