[BioC] 'Factorial Design' package question

Martin Morgan mtmorgan at fhcrc.org
Thu Jun 5 23:49:10 CEST 2008


Hi Galina --

Glazko, Galina wrote:
> Hello, 
> 
>  
> 
> I apologize if my question is too simple, still I would appreciate an
> advice.
> 
> I am using 'Factorial Design' package for Affy Drosophila 2.0 array.
> 
> I read *.cel files using standard 'rma' way: 
>> dat<-ReadAffy()
>> eset <- rma(dat)
> and created design for ANOVA separately in 'design.txt' file
> (I have 3 populations and 2 conditions I want to analyze).

[SNIP]

> #-------------------------------
> Now, to use 'esApply' with 'lm' function as described in 'Estrogen 2x2 
> factorial design' example I created a pData for eset:
> pData <-read.table(file="design.txt",sep='\t',row.names=1,header=TRUE)
> pData(eset)<-pData;
> #-------------------------------
> My question actually is: if I look now into eset object - 

[SNIP]

> #-------------------------------------------
> 
> I see that  
> 
>     POP: arbitrary numbering
>     ET: arbitrary numbering
> 
> - Does it mean that factors' levels (such as POP: E1_2, M1_2, R1_2 and
> ET: E, W) are incorrectly assigned in eset object?

R is representing POP and ET as factors. A factor is an integer vector,
with a 'map' that describes what text label is associated with each
integer. The 'arbitrary numbering' is saying that there is no particular
ordering to the map, e.g., you might have either of these

> x1=factor(c("E", "E", "W", "W"), levels=c("E", "W"))
> x2=factor(c("E", "E", "W", "W"), levels=c("W", "E"))
> x1
[1] E E W W
Levels: E W
> x2
[1] E E W W
Levels: W E

x1 has encoded "E" as the number 1
> dput(x1)
structure(c(1L, 1L, 2L, 2L), .Label = c("E", "W"), class = "factor")

x2 has encoded "E" as the number 2
> dput(x2)
structure(c(2L, 2L, 1L, 1L), .Label = c("W", "E"), class = "factor")

The encoding is consistent within each factor, but differs between x1
and x2. Either way, every observation keeps its correct label, so the
levels in your eset are not mis-assigned. No cause for alarm in this case.
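
A small sketch of the same point, using only base R. The name x3 and the choice of "W" as a new reference level are illustrative assumptions, not anything taken from the design file:

```r
# Same data, two different level orderings (as in x1 and x2 above)
x1 <- factor(c("E", "E", "W", "W"), levels = c("E", "W"))
x2 <- factor(c("E", "E", "W", "W"), levels = c("W", "E"))

# The underlying integer codes differ ...
as.integer(x1)   # 1 1 2 2
as.integer(x2)   # 2 2 1 1

# ... but the label attached to each observation is the same
identical(as.character(x1), as.character(x2))   # TRUE

# Level order does matter in one place: the first level is the one lm()
# uses as the baseline. relevel() changes it without touching the data.
x3 <- relevel(x1, ref = "W")
levels(x3)       # "W" "E"
```

So the ordering of the levels only decides which level comes first, not which sample belongs to which group.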

> And another question, probably related: I had 3 levels for the factor
> 'POP' in "design.txt" however looking in the output of lm, I can see
> only two levels - 
> 
> Call:
> lm(formula = y ~ ET + POP + ET:POP)
> 
> Coefficients:
> (Intercept)          ETW      POPM1_2      POPR1_2  ETW:POPM1_2  ETW:POPR1_2
> 
> Something is definitely wrong.

R is using something called treatment contrasts. It has chosen one level
of each factor as a 'base' for comparison ('E' for ET, 'E1_2' for POP)
and is expressing the fit of the model in terms of deviations from this
base. The (Intercept) is the fitted value for the base combination
(ET 'E', POP 'E1_2'), so E1_2 is in the model even though it has no
coefficient of its own; all three POP levels are accounted for. You're
probably expecting a table based on deviations from an overall mean.
Probably time for a sit-down with a statistician friend, especially one
familiar with R, to work through this; parts of the 'limma' user manual
are useful, too, as are several of the intro / not-so-intro R books.
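
To make the two parameterizations concrete, here is a minimal sketch in base R. The design data frame and the y values are made up for illustration (one sample per cell, not the real arrays), and contr.sum is shown only as one way to get 'deviations from an overall mean'; it is not necessarily what factDesign or the Estrogen example uses:

```r
# Hypothetical 3 x 2 design with one made-up response value per cell
design <- expand.grid(POP = c("E1_2", "M1_2", "R1_2"), ET = c("E", "W"))
design$POP <- factor(design$POP, levels = c("E1_2", "M1_2", "R1_2"))
design$ET  <- factor(design$ET,  levels = c("E", "W"))
design$y   <- c(5, 7, 9, 6, 10, 14)   # illustrative numbers only

# Default treatment contrasts: the base levels get no column of their own
mm_trt <- model.matrix(~ ET + POP + ET:POP, data = design)
colnames(mm_trt)
# "(Intercept)" "ETW" "POPM1_2" "POPR1_2" "ETW:POPM1_2" "ETW:POPR1_2"

fit_trt <- lm(y ~ ET + POP + ET:POP, data = design)
coef(fit_trt)["(Intercept)"]   # value of the base cell (ET = E, POP = E1_2)

# Sum-to-zero contrasts: same fit, coefficients are deviations from the mean
sum_contr <- list(ET = "contr.sum", POP = "contr.sum")
mm_sum  <- model.matrix(~ ET + POP + ET:POP, data = design,
                        contrasts.arg = sum_contr)
fit_sum <- lm(y ~ ET + POP + ET:POP, data = design, contrasts = sum_contr)
coef(fit_sum)["(Intercept)"]   # overall mean of the cells in this balanced design
```

Both model matrices have six columns and describe the same six cell means; only the meaning of the individual coefficients changes.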

Martin

> 
> And, if so, what is the correct way?
> 
>  
> 
> Best regards
> 
> Galina 
> 
>> sessionInfo()
> R version 2.5.1 (2007-06-27) 
> i386-pc-mingw32 
> 
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] "splines"   "tools"     "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"   "base"     
> 
> other attached packages:
>     factDesign        biomaRt          RCurl            XML   RColorBrewer    drosophila2 
>       "1.10.0"       "1.10.1"        "0.8-0"        "1.9-0"        "1.0-1"       "1.16.0" 
>       multtest       survival drosophila2cdf           affy         affyio        Biobase 
>       "1.16.1"         "2.32"       "1.16.0"       "1.14.2"        "1.4.1"       "1.14.1"


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


