[BioC] limma single channel analysis

Na, Ren Na at uthscsa.edu
Thu Dec 2 00:07:36 CET 2004


Hi,
We have two data sets from two experiments.
We did an experiment seven months ago. which is three old mutant mice(oldmu) were compared with three old wild type mice(oldwt). Eighteen two-color arrays were used(including dye swap), and each mouse appeared on six different arrays to study differential expression between oldmu and oldwt.
oldwt1 vs oldmu1
oldwt1 vs oldmu2
oldwt1 vs oldmu3
oldwt2 vs oldmu1
oldwt2 vs oldmu2
oldwt2 vs oldmu3
oldwt3 vs oldmu1
oldwt3 vs oldmu2
oldwt3 vs oldmu3
and corresponding dye swap arrays.

Recently, we did another experiment, which is four young mutant mice(mu) were compared with four young wild type mice(wt). Eight two-color arrays were used(including dye swap), and each mouse appeared on two different arrays to study differetial expression between youngmu and youngwt.
wt1 vs mu1
wt2 vs mu2
wt3 vs mu3
wt4 vs mu4
and corresponding dye swap arrays. For this experiment, we used newly printed slides which have 200 spot missing on each slide( 16k array).

Now, we are interested in the differential expression between oldmu and mu, and I want to combine these two data sets and use single channel analysis to get significant gene list.

I did two tests:
Test1:
I picked six arrays from first data set
oldwt1 vs oldmu1
oldwt2 vs oldmu2
oldwt3 vs oldmu3
and corresponding dye swap arrays, and all arrays from second experiment.

My target file is
SlideNumber     FileName        Cy3     Cy5
1       1529.spot       wt      mu
2       1530.spot       mu      wt
3       1521.spot       wt      mu
4       1532.spot       mu      wt
5       1535.spot       wt      mu
6       1536.spot       mu      wt
7       1523.spot       wt      mu
8       1524.spot       mu      wt
9       1391.spot       oldwt   oldmu
10      1392.spot       oldmu   oldwt
11      1371.spot       oldwt   oldmu
12      1372.spot       oldmu   oldwt
13      1397.spot       oldwt   oldmu
14      1398.spot       oldmu   oldwt

I analyzed in the following way 
require(limma)
targets<-readTargets("Targets.txt")
RG<-read.maimages(targets$FileName, source="spot",wt.fun=wtarea(100))
RG$genes<-readGAL("Mouse24052004_246.txt")
RG$printer<-getLayout(RG$genes)
spottypes<-readSpotTypes("spottypes2.txt")
RG$genes$Status<-controlStatus(spottypes,RG$genes)
RG<-backgroundCorrect(RG,method="minimum")
MA<-normalizeWithinArrays(RG,method="printtiploess")
MA<-normalizeBetweenArrays(MA,method="quantile")
ind<-(MA$genes$Status %in% c("blank", "miss"))
targets.sc<-targetsA2C(targets)
design.sc<-model.matrix(~0+factor(targets.sc$Target)+factor(targets.sc$channel))
colnames(design.sc)<-c("mu","oldmu","oldwt","wt","ch")
# I subset MA to get rid of missing spots, and blank(buffer) control
corfit<-intraspotCorrelation(MA[!ind,],design.sc)
fit <- lmscFit(MA[!ind,],design.sc,correlation=corfit$consensus)
contrast.matrix<-makeContrasts(oldmu-mu,levels=design.sc)
fit2<-contrasts.fit(fit,contrast.matrix)
fit2<-eBayes(fit2)
tab<-topTable(fit2,number=400,adjust="fdr")

My questions are
1) Can I analyze like above to get significant gene list? If what I did is correct, and if I want to include all eighteen arrays from first experiment instead of six arrays, how should I modify the codes?
2) I used subset MA(get rid of missing spots) to fit the model, is it correct way to handle slide differece between two batchs of slides?

Test2

Before I did Test1, I thought I was going to get error from function intraspotCorrelation(I got error from this function before for my other data sets, but not when I did my Test1. I still use limma version 1.8.8). I tried another "weird" way like the following(target file is same as that in Test1)

require(limma)
require(convert)
targets <- readTargets("Targets.txt")
RG<-read.maimages(targets$FileName, source="spot",wt.fun=wtarea(100))
RG$genes<- readGAL("Mouse24052004_246.txt")
RG$printer<-getLayout(RG$genes)
spottypes<-readSpotTypes("spottypes2.txt")
RG$genes$Status<- controlStatus(spottypes, RG$genes)
RG<-backgroundCorrect(RG,method="minimum")
MA<-normalizeWithinArrays(RG, method="printtiploess")
MA<-normalizeBetweenArrays(MA, method="quantile")
# I convert MA to RG, then convert RG to exprSet, and then analyse like affy metrix array
RG2<-RG.MA(MA)
eset<-as(RG2,"exprSet")
geneNames(eset)<-paste(RG2$genes$ID,RG2$genes$Symbol,sep="\t")
design<-model.matrix(~ -1+factor(c(1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,3,4,4,3,3,4,4,3,3,4,4,3)))
colnames(design)<-c("youngwt","youngmu","oldwt","oldmu")
ind<-(MA$genes$Status %in% c("blank", "miss"))
fit<-lmFit(eset[!ind,],design)
contrast.matrix<-makeContrasts(oldmu-youngmu,levels=design)
fit2<-contrasts.fit(fit,contrast.matrix)
fit2<-eBayes(fit2)
tab<-topTable(fit2,number=400,adjust="fdr")

I am just wondering whether the way I did here make sense?

Thanks in advance!

Ren

	[[alternative HTML version deleted]]



More information about the Bioconductor mailing list