[BioC] ANOVA test after VST in DESeq
Liu, XiaoChuan
xiaochuan.liu at mssm.edu
Thu Mar 14 16:57:29 CET 2013
Dear Simon,
Here are my code for ANOVA test after VST in DESeq, Do you think it is right to do ANOVA test by VST? Thanks!
Code:
CountTable <- read.table("all.samples.count.txt",header=TRUE, row.names=1)
library('DESeq')
Design<-data.frame(row.names=colnames(CountTable), condition = c("Wildtype","Wildtype","Knockout","Knockout","Wildtype","Wildtype","Knockout","Knockout"), treatment = c("Picrotoxin","Control","Picrotoxin","Control","Picrotoxin","Control","Picrotoxin","Control"))
cds <- newCountDataSet( CountTable, Design )
cds <- estimateSizeFactors( cds )
cds <- estimateDispersions( cds )
x<-getVarianceStabilizedData(cds)
CountTable_new <- x
nrow_number<-nrow(CountTable_new)
ncol_number<-ncol(CountTable_new)
all_P_value<-0;
for(i in 1:nrow_number){
t1<-as.matrix(Design$condition)
t2<-as.matrix(Design$treatment)
t3<-as.matrix(CountTable_new[i,])
anovaTable<-matrix(c(t1,t2,t3),nrow=ncol_number,ncol=3,dimnames = list(c(1:ncol_number),c("condition","treatment","expression")))
anovaTable_data<-as.data.frame(anovaTable)
anovaTable_data$expression<-as.integer(anovaTable_data$expression)
int <- aov(expression ~ condition*treatment, data= anovaTable_data)
int_matrix<-as.matrix(anova(int))
p_value=int_matrix[3,5]
all_P_value<-c(all_P_value,p_value)
}
all_P_value<-all_P_value[2:(nrow_number+1)]
all_P_value<-as.matrix(all_P_value)
anova<-matrix(c(CountTable_new,all_P_value,all_P_value),nrow=nrow_number,ncol=(ncol_number+2),dimnames = list(rownames(CountTable_new),c(colnames(CountTable_new),"p_value","q_value")))
anova<-as.data.frame(anova)
anova$q_value<-p.adjust(as.numeric(anova$p_value),method = "BH")
output_file_name="two_way_ANOVA.txt"
write.table( anova, file=output_file_name, quote= FALSE, sep="\t", row.names= TRUE )
Leo
>3. In Part 3, I also do the overlap with 6 results in Part 2.
> But the overlap are very small. I wonder if I make a mistake to
> misuse the variance stabilizing transformation? If I want to directly
> use the ANOVA function in R to calculate co-factor P-value, could I
> use the raw count? Or How to normalize the raw counts then I can use
> ANOVA function in R?
No, ordinary-least-square (OLS) ANOVA requires data to be homoscedastic
and this count data is not. This is, after all, the whole point of
either using GLMs on the raw count data, or OLS ANOVA on
variance-stabilized data.
I would have expected some similarity between the results of a GLM
ANODEV anaysis of the count data and the OLS ANOVA analysis on the vST
data, but as you did not post the code you used, it is hard to say
whether you may have made a mistake.
Simon
_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list