[BioC] single color using Limma : very low p-value "1.40456052121752e-60"
Sean Davis
sdavis2 at mail.nih.gov
Wed Aug 10 19:21:00 CEST 2011
I didn't go through the code in great detail, but since these are
1-color data, make sure that you are not testing that the intercept is
different than 0 (and it looks like you probably are).
Sean
On Wed, Aug 10, 2011 at 5:11 AM, bigoun <bigounn at yahoo.fr> wrote:
> Dear all,
>
> I'm analysing data single color data (only intensities at F532) using limma.
>
> I have 14 arrays in one group and 5 in the other group. I want to compare them
> so my targets file is as follows
> FileName Mal Sain
> M2.gpr 1 0
> M3.gpr 1 0
> M4.gpr 1 0
> M5.gpr 1 0
> M6.gpr 1 0
> M7.gpr 1 0
> M8.gpr 1 0
> M9.gpr 1 0
> M10.gpr 1 0
> M11.gpr 1 0
> M12.gpr 1 0
> M13.gpr 1 0
> M14.gpr 1 0
> M15.gpr 1 0
> S1.gpr 0 1
> S2.gpr 0 1
> S3.gpr 0 1
> S4.gpr 0 1
> S5.gpr 0 1
>
> I use some correction and normalisation methods as recommanded in limma user
> guide. Afterthat, I make my design and as I have 2 technical replicate (2 spot
> for the same miRNA) for each array, I wanted to compute duplicatecorrelation
> coefficient
>
>
> .GAL file as follows :
> ATF 1
> 52 5
> Type=GenePix ArrayList V1.0
> BlockCount=48
> BlockType=0
> URL=http://genome-www4.stanford.edu/cgi-bin/SGD/locus.pl?locus=[ID]
> "Block1=2645, 2295, 100, 12, 300, 8, 300"
> "Block2=7145, 2295, 100, 12, 300, 8, 300"
> "Block3=11555, 2235, 100, 12, 300, 8, 300"
> "Block4=16055, 2235, 100, 12, 300, 8, 300"
>
>
> The corfit$consensus is not good (0,36). Do I still use this parameters for
> lmfit???
>
> Anyway, the more strange is with or without this parameters I have p-adjust
> value very very low.
>
> first line of my result :
> Block Row Column ID Name X.Intercept. factor.pData.population.sain AveExpr F P.Value adj.P.Val
>
> 39 2 10 42554 hsa-miR-923 13.4125293049973 -0.0517387609947947 13.3989138415776 25264.4487822867 1.20563134868457e-63 1.40456052121752e-60
>
>
> Is anybody help me and what is wrong.
>
> As you can see in UNFILTERED section, I have less than 50% genes for the
> analysis. Could it be the explanation of these strange result??
>
> Thanks in advance
>
> My srcipt:
> library(limma)
> myFilter = function(X) {
> H_threshold=2
> okFLAG = X$Flags > -49;
> okm1 =abs(X[,"F532 Median"]-X[,"F532 Mean"])
> okm2 = 0.5*(X[,"F532 Median"]+X[,"F532 Mean"])
> okH = ((okm1)/okm2) <H_threshold
> as.numeric(okFLAG & okH)}
>
> targets=readTargets(file="targets.txt", path=NULL, sep="\t")
>
> E =read.maimages(targets,source="genepix",wt.fun=myFilter,columns=list(E="F532
> Mean",Eb="B532 Mean",Names="Name"))
>
> #Taux de filtrage
> unFiltered = apply(E$weights,MARGIN=2,FUN=mean)
> round(unFiltered,2)
>
> # M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12 M13 M14 M15 S1 S2
>
> #0.41 0.43 0.34 0.29 0.51 0.35 0.48 0.35 0.29 0.56 0.38 0.56 0.61 0.52 0.48 0.34
>
> # S3 S4 S5
> #0.36 0.46 0.47
>
> #Boite a moustache avant correction
> boxplot(as.data.frame(E$E),main="Mean intensities")
>
> #Correction background - normexp
> Enorm <- backgroundCorrect(E, method="normexp",offset=1)
> boxplot(as.data.frame(Enorm$E), main="Mean intensities - normexp")
> #Normalisation quantile
> NormE <- normalizeBetweenArrays(Enorm,method="quantile")
> boxplot(as.data.frame(NormE$E), main="Normalized intensities")
>
> NormE$E <- log2(NormE$E)
>
> pData <- data.frame(population = c('mal','mal', 'mal',
> 'mal','mal','mal','mal','mal','mal','mal','mal','mal','mal','mal','sain','sain','sain','sain','sain'))
>
>
> # create design matrix
> design <- model.matrix(~factor(pData$population))
>
> # In my .gpr files all miRNAs contain the string "miR" in their name
> # so the grep function can be used to extract all of these, removing
> # all control signals and printing buffers etc.
> # You need to check your .gpr files to find which signals you should extract.
> miRs <- c(grep("-miR-", NormE$genes$Name), grep("-let-", NormE$genes$Name))
> NormE.final <- NormE[miRs, ]
> NormE.final <-NormE.final[order(NormE.final$genes$Name), ]
> NormE.final$genes$Name
>
> # calculate duplicate correlation between the 2 replicates on each array
> corfit <- duplicateCorrelation(NormE.final, design, ndups=2)
> corfit$consensus
> # show a boxplot of the correlations
> boxplot(tanh(corfit$atanh.correlations))
>
> # fit the linear model, including info on duplicates and correlation
> fit <- lmFit(NormE.final, design, ndups=2, correlation=corfit$consensus)
> #fit <- lmFit(NormE.final, design)
> # calculate values using ebayes
> ebayes <- eBayes(fit)
>
> # output a list of top differnetially expressed genes...
> result=topTable(ebayes,number=4608, p=0.05,adjust = "BH")
> genes=result$Status=="genes"
> write.table(result,"gene.txt",quote=FALSE,sep="\t",row.names=FALSE,
> col.names=TRUE)
>
>
> mikel
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
More information about the Bioconductor
mailing list