[BioC] Using limma to identify differentially expressed genes
Ovokeraye Achinike-Oduaran
ovokeraye at gmail.com
Wed Apr 11 13:28:12 CEST 2012
Hi Jim,
I digress a bit (sorry David). But I was looking at your code
combining both getGEO and getGEOSuppFiles. The analysis you did I'm
guessing is based on the raw files because you had to read in an
affybatch object. I'm having a challenge with making my covdesc.txt
file to work for me with read.affy(), so I'm wondering if your
combination is a way to retrieve the phenotypic data without having to
manually create the text file. In other words, my question is: does
this combination of getGEO() and getGEOSuppFiles() make it possible to
boycott the use of the manually created covdesc.txt file in
read.affy()?
Thanks.
Avoks
On Tue, Apr 10, 2012 at 5:14 PM, David Westergaard <david at harsk.dk> wrote:
> Hello,
>
> I've been trying to use limma to identify genes from the following
> data: http://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-21340 -
> It's a simple control vs. disease experiment
>
>
> # SDRF downloaded from above page
> SDRF <- read.table(file="E-GEOD-21340.sdrf.txt",header=TRUE,stringsAsFactors=FALSE,sep="\t")
>
> # Looking to compare Family-history negativer versus Diabetis
> controls <- SDRF[grep("Control, Family History
> Neg",SDRF$Comment..Sample_source_name.),]
> disease <- SDRF[grep("^DM",SDRF$Characteristics.DiseaseState.),]
> Batch <- rbind(controls,disease)
>
> # Read in CEL files
> mixture.batch <- ReadAffy(filenames=Batch$Array.Data.File)
>
> # Preprocess data
> mixture.processed <- expresso(mixture.batch, bgcorrect.method = "rma",
> normalize.method = "quantiles", pmcorrect.method = "pmonly",
> summary.method = "medianpolish")
>
> # Get data in matrix
> signals <- exprs(mixture.prepared)
> cl <- ifelse(colnames(signals) %in% disease$Array.Data.File,1,0)
>
> # Do design matrix and fit
> design <- model.matrix(~factor(cl))
> fit <- lmFit(signals,design)
> fit <- eBayes(fit)
> topTable(fit2,coef=2)
>
> Which yields the following:
> ID logFC AveExpr t P.Value adj.P.Val B
> 7513 208004_at -0.323 5.43 -4.65 0.00191 0.999 -3.10
> 11225 211829_s_at 0.340 5.07 4.36 0.00278 0.999 -3.17
> 5950 206424_at -0.907 6.65 -4.15 0.00363 0.999 -3.23
> 1354 201826_s_at -0.447 8.37 -4.13 0.00374 0.999 -3.24
> 19782 220418_at 0.392 5.43 4.02 0.00431 0.999 -3.27
> 8889 209396_s_at 1.899 7.47 4.01 0.00437 0.999 -3.28
> 5005 205478_at -0.931 9.22 -3.94 0.00481 0.999 -3.30
> 9469 209983_s_at 0.412 5.72 3.92 0.00492 0.999 -3.31
> 2936 203409_at 0.506 6.93 3.87 0.00531 0.999 -3.32
> 5054 205527_s_at 0.331 6.80 3.84 0.00549 0.999 -3.33
>
> I'm abit puzzled over the adjusted P-values. Can it really be true
> that ALL of the adjusted P-values are 0.999, or did I make a rookie
> mistake somewhere?
>
> Best Regards,
> David Westergaard
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list