[R] Problem with RMA using limma, oligo and pdInfoBuilder packages

Martin Morgan mtmorgan at fhcrc.org
Tue Jan 27 18:08:14 CET 2009


Anne-Marie Madore <anne-marie.madore.1 at ulaval.ca> writes:

> Hi,
>
>  
>
> I am a Ph.D. student from Québec, Canada. I’m a beginner with R and
> Bioconductor. Until now the only experience I have is in analyzing

Please ask Bioconductor questions on the Bioconductor mailing list.

  http://bioconductor.org/docs/mailList.html

Martin

> microarray data using affy and limma packages. Now I am trying to analyze
> Rat Gene 10 st arrays and I would like to run RMA analysis and Smyth
> moderated t test on those arrays. Since no cdf official package is available
> for those arrays, after reading many of the questions and responses on this
> mailing list, I decided to use pdInfoBuilder, oligo and limma packages to
> run analysis. The problem is, at the end, I get expression and differential
> expression measured for all probe separately but not the calculated
> expression representing all probe of each gene. When I run RMA, I got only
> two steps, Background correcting and Normalizing but not Calculating
> expression. Do you know how I can get differential expression calculated for
> each gene? I don’t know if the problem is in the package I built or if I can
> use some code to answer this question. I list all codes used to build and
> install the package “pd.ragene.1.0.st.v1” and used to analyze expression
> arrays below.
>
>  
>
> Many thanks for your help,
>
>  
>
> Anne-Marie Madore
>
>  
>
>  
>
>  
>
>  
>
> ## building the package
>
>  
>
>> library(Biobase)
>
> Loading required package: tools
>
>  
>
> Welcome to Bioconductor
>
>  
>
>   Vignettes contain introductory material. To view, type
>
>   'openVignette()'. To cite Bioconductor, see
>
>   'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>  
>
>> library(pdInfoBuilder)
>
> Loading required package: RSQLite
>
> Loading required package: DBI
>
> Loading required package: affxparser
>
> Loading required package: oligo
>
> Loading required package: splines
>
> Loading required package: preprocessCore
>
> Loading required package: AnnotationDbi
>
> Loading required package: oligoClasses
>
> oligo Package - Series 1.5.x
>
>> setwd("D:/Anne-Marie/Doctorat/puces ADN macrophages/puces rat/Annie
> Dube/Analyse")
>
>> transFile <-
> "RaGene-1_0-st-v1.na27.rn4.transcript.csv1/RaGene-1_0-st-v1.na27.rn4.transcr
> ipt.csv"
>
>> probeFile <- "RaGene-1_0-st-v1.probe.tab/RaGene-1_0-st-v1.probe.tab"
>
>> clfFile <- "RaGene-1_0-st-v1.r4.clf/RaGene-1_0-st-v1.r4.clf"
>
>> pgfFile <- "RaGene-1_0-st-v1.r4.pgf/RaGene-1_0-st-v1.r4.pgf"
>
>> pkg <- new("AffyGenePDInfoPkgSeed", author="Anne-Marie Madore",
> email="anne-marie.madore.1 at ulaval.ca", version="0.0.1",
>
> + genomebuild="RefSeq April 3, 2007, GenBank® January 25, 2007, Rat Ensembl
> transcripts April 3, 2007 ",
>
> + biocViews="AnnotationData", pgfFile=pgfFile, clfFile=clfFile,
> transFile=transFile, probeFile=probeFile)
>
>> makePdInfoPackage(pkg, destDir=".")
>
> Creating package in ./pd.ragene.1.0.st.v1 
>
> loadUnitsByBatch took 50.51 sec
>
> loadAffyCsv took 12.73 sec
>
> loadAffySeqCsv took 57.62 sec
>
> DB sort, index creation took 24.75 sec
>
> [1] TRUE
>
> Warning messages:
>
> 1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
>
> 2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
>
>  
>
>  
>
> ## installing the package in cmd command shell
>
>  
>
> Microsoft Windows [version 6.0.6001]
>
> Copyright (c) 2006 Microsoft Corporation. Tous droits réservés.
>
>  
>
> C:\Users\Anne-Marie Madore>cd c:\Program Files\R\R-2.8.1\bin
>
>  
>
> c:\Program Files\R\R-2.8.1\bin>R CMD INSTALL pd.ragene.1.0.st.v1
>
> installing to 'c:/PROGRA~1/R/R-28~1.1/library'
>
>  
>
>  
>
> ---------- Making package pd.ragene.1.0.st.v1 ------------
>
>   adding build stamp to DESCRIPTION
>
>   installing NAMESPACE file and metadata
>
>   installing R files
>
>   installing inst files
>
>   preparing package pd.ragene.1.0.st.v1 for lazy loading
>
> Loading required package: RSQLite
>
> Loading required package: DBI
>
> Loading required package: oligoClasses
>
> Loading required package: Biobase
>
> Loading required package: tools
>
>  
>
> Welcome to Bioconductor
>
>  
>
>   Vignettes contain introductory material. To view, type
>
>   'openVignette()'. To cite Bioconductor, see
>
>   'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>  
>
>   no man files in this package
>
>   installing indices
>
>   installing help
>
>   adding MD5 sums
>
>  
>
> * DONE (pd.ragene.1.0.st.v1)
>
>  
>
>  
>
> ## If I run a check (R CMD check pd.ragene.st.v1) I get three warning
> messages and one note: 
>
>  
>
> 1.       * checking R files for non-ASCII characters ... WARNING 
> Found the following files with non-ASCII characters: all.R Portable packages
> must use only ASCII characters in their R code, except perhaps in comments.
>
> 2.       * checking whether the name space can be loaded with stated
> dependencies ... WARNING
> Error in initDbConnection() : could not find function "dbConnect" Error:
> .onLoad failed in 'loadNamespace' for 'pd.ragene.1.0.st.v1' Execution halted
> A namespace must be able to be loaded with just the base namespace loaded:
> otherwise if the namespace gets loaded by a saved object, the session will
> be unable to start. 
> Probably some imports need to be declared in the NAMESPACE file.
>
> 3.       * checking R code for possible problems ... NOTE
> closeDb: no visible binding for global variable 'dbCon' 
>
> 4.       * checking for missing documentation entries ... WARNING
> Undocumented code objects: 
> pd.ragene.1.0.st.v1 
> All user-level objects in a package should have documentation entries. See
> the chapter 'Writing R documentation files' in manual 'Writing R
> Extensions'.
>
>  
>
>  
>
> ## analyzing the package
>
>  
>
>> library("pd.ragene.1.0.st.v1")
>
>> library(oligo)
>
>> library(limma)
>
>> library(genefilter)
>
> Loading required package: survival
>
>> library(geneplotter)
>
> Loading required package: lattice
>
> Loading required package: annotate
>
> Loading required package: xtable
>
> KernSmooth 2.22 installed
>
> Copyright M. P. Wand 1997
>
>> cel.files <- list.celfiles(".", full.names = TRUE)
>
>> basename(cel.files)
>
>  [1] "AD_Ctrl_1.CEL"    "AD_Ctrl_2.CEL"    "AD_Ctrl_3.CEL"
> "AD_Ctrl_5.CEL"   
>
>  [5] "AD_Ctrl_6.CEL"    "AD_Traite_10.CEL" "AD_Traite_11.CEL"
> "AD_Traite_7.CEL" 
>
>  [9] "AD_Traite_8.CEL"  "AD_Traite_9.CEL" 
>
>> test <- read.celfiles(cel.files)
>
> Platform design info loaded.
>
>> phenoData(test) <- read.AnnotatedDataFrame("phenoData.txt", header = TRUE,
> row.name=1)
>
>> class(test)
>
> [1] "GeneFeatureSet"
>
> attr(,"package")
>
> [1] "oligoClasses"
>
>> class(phenoData)
>
> [1] "standardGeneric"
>
> attr(,"package")
>
> [1] "methods"
>
>> eset <- rma(test)
>
> Background correcting
>
> Normalizing
>
>> e <- exprs(eset)
>
>> index1 <- 1:5
>
>> index2 <- 6:10
>
>> d <- rowMeans(e[, index1]) - rowMeans(e[, index2])
>
>> design <- model.matrix(~factor(eset$Key))
>
>> fit <- lmFit(eset, design)
>
>> ebayes <- eBayes(fit)
>
>> sample <- row.names(ebayes)
>
>> Pvalue <- ebayes$p.value[,2]
>
>> Mean1 <-  rowMeans(e[,index1])
>
>> Mean2 <-  rowMeans(e[,index2])
>
>> sd1 <- apply(e[,index1], 1, "sd")
>
>> sd2 <- apply(e[,index2], 1, "sd")
>
>> csv <- read.csv(file="D:/Anne-Marie/Doctorat/puces ADN macrophages/puces
> rat/Annie
> Dube/Analyse/RaGene-1_0-st-v1.na27.rn4.transcript.header.DOS.csv",head=TRUE,
> sep=",")
>
>> all <- merge(csv, Pvalue, by.x="sample", by.y=0, all=T)
>
>> all2 <- merge(all, Mean1, by.x="sample", by.y=0, all=T)
>
>> all3 <- merge(all2, Mean2, by.x="sample", by.y=0, all=T)
>
>> all4 <- merge(all3, sd1, by.x="sample", by.y=0, all=T)
>
>> all5 <- merge(all4, sd2, by.x="sample", by.y=0, all=T)
>
>> all6 <- merge(all5, d, by.x="sample", by.y=0, all=T)
>
>> write.table(data.frame(all6), file="ULTIMETESTtout.xls", sep="\t",
> col.names = NA)
>
>> dim(all6)
>
> [1] 240410     24
>
>> 
>
>> # I also try using slightly the same way as I used with affy package
>
>> 
>
>> pd <- read.AnnotatedDataFrame("pheno.txt", header = TRUE, row.names = 1)
>
>> pData(pd)
>
>                  Phenotype
>
> AD_Ctrl_1.CEL      Control
>
> AD_Ctrl_2.CEL      Control
>
> AD_Ctrl_3.CEL      Control
>
> AD_Ctrl_5.CEL      Control
>
> AD_Ctrl_6.CEL      Control
>
> AD_Traite_7.CEL    Traited
>
> AD_Traite_8.CEL    Traited
>
> AD_Traite_9.CEL    Traited
>
> AD_Traite_10.CEL   Traited
>
> AD_Traite_11.CEL   Traited
>
>> a <- read.celfiles(filenames = rownames(pData(pd)), phenoData = pd,
> verbose = TRUE)
>
> Platform design info loaded.
>
>> eset <- rma(a)
>
> Background correcting
>
> Normalizing
>
>> exprs.eset <- exprs(eset)
>
>> index1 <- 1:5
>
>> index2 <- 6:10
>
>> d <- (rowMeans(exprs.eset[,index1]) - rowMeans(exprs.eset[,index2]))
>
>> population.groups <- factor (c(rep("Control",5), rep ("Traited",5)))
>
>> design <- model.matrix (~population.groups)
>
>> fit <- lmFit (eset, design)
>
>> fit.eBayes <- eBayes (fit)
>
>> sample <- row.names(fit.eBayes)
>
>> Pvalue <- fit.eBayes$p.value[,2]
>
>> Mean1 <-  rowMeans(e[,index1])
>
>> Mean2 <-  rowMeans(e[,index2])
>
>> sd1 <- apply(e[,index1], 1, "sd")
>
>> sd2 <- apply(e[,index2], 1, "sd")
>
>> csv <- read.csv(file="D:/Anne-Marie/Doctorat/puces ADN macrophages/puces
> rat/Annie
> Dube/Analyse/RaGene-1_0-st-v1.na27.rn4.transcript.header.DOS.csv",head=TRUE,
> sep=",")
>
>> all <- merge(csv, Pvalue, by.x="sample", by.y=0, all=T)
>
>> all2 <- merge(all, Mean1, by.x="sample", by.y=0, all=T)
>
>> all3 <- merge(all2, Mean2, by.x="sample", by.y=0, all=T)
>
>> all4 <- merge(all3, sd1, by.x="sample", by.y=0, all=T)
>
>> all5 <- merge(all4, sd2, by.x="sample", by.y=0, all=T)
>
>> all6 <- merge(all5, d, by.x="sample", by.y=0, all=T)
>
>> write.table(data.frame(all6), file="TESTtout.xls", sep="\t", col.names =
> NA)
>
>> dim(all6)
>
> [1] 240410     24
>
>> sessionInfo()
>
> R version 2.8.1 (2008-12-22) 
>
> i386-pc-mingw32 
>
>  
>
> locale:
>
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
>  
>
> attached base packages:
>
> [1] splines   tools     stats     graphics  grDevices utils     datasets
> methods   base     
>
>  
>
> other attached packages:
>
>  [1] geneplotter_1.20.0        annotate_1.20.1           xtable_1.5-4
>
>
>  [4] lattice_0.17-17           genefilter_1.22.0         survival_2.34-1
>
>
>  [7] limma_2.16.3              pd.ragene.1.0.st.v1_0.0.1 pdInfoBuilder_1.6.0
>
>
> [10] oligo_1.6.0               oligoClasses_1.4.0        AnnotationDbi_1.4.2
>
>
> [13] preprocessCore_1.4.0      affxparser_1.14.2         RSQLite_0.7-1
>
>
> [16] DBI_0.2-4                 Biobase_2.2.1            
>
>  
>
> loaded via a namespace (and not attached):
>
> [1] grid_2.8.1         KernSmooth_2.22-22 RColorBrewer_1.0-2
>
>  
>
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793




More information about the R-help mailing list