[BioC] Siggenes SAM analysis: log2 transformation and Understanding output
David Westergaard
david at harsk.dk
Wed Feb 15 14:30:19 CET 2012
Hello,
I am currently working on a data set about kiwi consumption for my
bachelor's project. The data is available at
http://www.ebi.ac.uk/arrayexpress/experiments/E-MEXP-2030
I'm a bit confused as to how to interpret the output parameters,
specifically p0. I've run the following code:
library(siggenes)
# Read the RMA-processed expression table (one row per probe set)
dataset <- read.table("OAS_RMA.txt", header=TRUE)
# Column names of the 10 control and 9 experimental arrays
control_cols <- c("CEL12.1","CEL13.1","CEL23.1","CEL25.1","CEL37.1","CEL59.1","CEL61.1","CEL78.1","CEL9.1","CEL92.1")
experiment_cols <- c("CEL18.1","CEL21.1","CEL3.1","CEL31.1","CEL46.1","CEL50.1","CEL56.1","CEL57.1","CEL7.1")
# Expression matrix: columns 1-10 are controls, columns 11-19 are experiments
datamatrix <- as.matrix(dataset[, c(control_cols, experiment_cols)])
# Class labels: 0 = control, 1 = experiment
y <- rep(c(0, 1), c(length(control_cols), length(experiment_cols)))
gene_names <- as.character(dataset$Hybridization.REF)
# rand fixes the random seed so the permutations are reproducible
sam.obj <- sam(datamatrix, y, gene.names=gene_names, rand=12345)
sam.obj
Output:
SAM Analysis for the Two-Class Unpaired Case Assuming Unequal Variances
s0 = 0
Number of permutations: 100
MEAN number of falsely called variables is computed.
   Delta    p0    False Called    FDR cutlow cutup   j2    j1
1    0.1 0.634 28335.89  37013 0.4851 -1.058 0.354 9709 27372
2    0.5 0.634 11200.82  21273 0.3336 -2.271 0.910 2447 35850
3    0.9 0.634   249.38   1522 0.1038 -3.374 3.088  541 53695
4    1.3 0.634     9.67    134 0.0457 -4.402 5.577  127 54669
5    1.7 0.634     0.69     20 0.0219 -5.596   Inf   20 54676
6    2.1 0.634        0      1      0 -9.072   Inf    1 54676
7    2.5 0.634        0      1      0 -9.072   Inf    1 54676
8    2.9 0.634        0      1      0 -9.072   Inf    1 54676
9    3.3 0.634        0      1      0 -9.072   Inf    1 54676
10   3.7 0.634        0      0      0   -Inf   Inf    0 54676
I'm using the rand parameter because the results seem to vary a bit
between runs. p0 is in this case 0.634, and I'm not sure how to
interpret this. In the literature, it is described as the "prior
probability that a gene is not differentially expressed". What exactly
does this mean? Does it imply that there is a ~63% chance that the
genes in question are actually NOT differentially expressed?
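As a rough consistency check on how p0 enters the table, the FDR column
can be recomputed from the other columns. The sketch below assumes that
FDR is reported as p0 * False / Called; this relation is inferred from
the numbers above and is not confirmed from the package source. The
values are copied from rows 1-5 of the output, and small differences are
expected because p0 is printed rounded to three decimals.

```r
# Values copied from rows 1-5 of the SAM output above
p0           <- 0.634
false_calls  <- c(28335.89, 11200.82, 249.38, 9.67, 0.69)
called       <- c(37013, 21273, 1522, 134, 20)
fdr_reported <- c(0.4851, 0.3336, 0.1038, 0.0457, 0.0219)

# Assumed relation (not taken from the siggenes source):
# FDR = p0 * (expected number of false calls) / (number of genes called)
fdr_recomputed <- p0 * false_calls / called

# Compare the recomputed values with the printed FDR column
print(round(fdr_recomputed, 4))
print(max(abs(fdr_recomputed - fdr_reported)))
```

If this relation holds, p0 is a single value for the whole data set that
scales the FDR at every Delta, rather than a statement about the genes
called at one particular Delta.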
I've also found various sources saying that it is a good idea to
log2 transform the data before passing it to SAM. Does this still
apply, and if so, why?
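One way to see whether a transformation is still needed is to look at
the range of the values. The sketch below uses made-up numbers rather
than OAS_RMA.txt, and both the helper names and the threshold of 30 are
hypothetical choices for illustration. It relies on the fact that RMA
values as produced by the affy package are already on the log2 scale, so
they typically stay well below 30, whereas raw intensities run into the
thousands.

```r
# Hypothetical helper: guess whether a matrix is already log2-scaled.
# The cutoff max_log = 30 is an arbitrary heuristic, not a standard value.
looks_log2 <- function(x, max_log = 30) {
  rng <- range(x, na.rm = TRUE)
  rng[2] <= max_log
}

# Made-up example data: raw-like intensities and their log2 version
set.seed(1)
raw_like <- matrix(2^runif(20, min = 5, max = 14), ncol = 4)
log_like <- log2(raw_like)

print(looks_log2(raw_like))
print(looks_log2(log_like))

# Hypothetical wrapper: transform only when the data do not look log-scaled
to_log2 <- function(x) if (looks_log2(x)) x else log2(x)
```

Applied to the real data, something like to_log2(datamatrix) would leave
an already log-scaled matrix untouched and avoid transforming it twice.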
Best Regards,
David Westergaard
Undergraduate student
Technical University of Denmark