[Bioc-devel] Can't set featureNames on naive ExpressionSet object

Sat Jun 28 05:36:50 CEST 2008

Gordon K Smyth <smyth at wehi.EDU.AU> writes:

> Why can't I set featureNames?  

Try

narrays <- 4
nprobes <- 100
exprs <- matrix(rnorm(nprobes*narrays),nprobes,narrays,
                dimnames=list(
                  paste("Gene", 1:nprobes, sep=""),
                  NULL))
eset <- new("ExpressionSet", exprs=exprs)

or

eset <- new("ExpressionSet",
            exprs=matrix(rnorm(nprobes*narrays),nprobes,narrays))
featureNames(eset) <- paste("Gene",1:nprobes,sep="")

or the solution below. The recommended pattern for eSet-like objects
is to construct the components and then assemble into the complete
object with a call to 'new'.
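
A minimal sketch of that pattern, assuming Biobase is installed. The sample names "a" to "d" and the helper variables geneNames and arrayNames are made up for illustration, and the data frames carry row names only:

```r
library(Biobase)

narrays <- 4
nprobes <- 100
geneNames <- paste("Gene", 1:nprobes, sep="")
arrayNames <- letters[1:narrays]   # hypothetical sample names

# build each component first, with matching feature and sample names
exprs <- matrix(rnorm(nprobes*narrays), nprobes, narrays,
                dimnames=list(geneNames, arrayNames))
fData <- new("AnnotatedDataFrame", data=data.frame(row.names=geneNames))
pData <- new("AnnotatedDataFrame", data=data.frame(row.names=arrayNames))

# then assemble in one call, so no intermediate invalid object exists
eset <- new("ExpressionSet", exprs=exprs,
            phenoData=pData, featureData=fData)
validObject(eset)
```

Because all three components agree on their names before 'new' is called, the validity check should pass without any later fix-up.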

The 'why' is that exprs<-,ExpressionSet,matrix-method creates an
invalid object

> eset <- new("ExpressionSet")
> exprs(eset) <- matrix(rnorm(nprobes*narrays),nprobes,narrays)
> validObject(eset)
Error in validObject(eset) : 
  invalid class "ExpressionSet" object: 1: feature numbers differ between assayData and featureData
invalid class "ExpressionSet" object: 2: sample numbers differ between assayData and phenoData

that featureNames<- then correctly objects to.  The thought was that
transiently invalid objects like this were sometimes necessary in the
S4 world, e.g., as here, where the exprs, phenoData and featureData
slots all need to be updated for a correct object:

narrays <- 4
nprobes <- 100
eset <- new("ExpressionSet")

exprs(eset) <- matrix(rnorm(nprobes*narrays),nprobes,narrays)

fData <- data.frame(row.names=paste("Gene", 1:nprobes, sep=""))
featureData(eset) <- new("AnnotatedDataFrame", data=fData)

pData <- data.frame(row.names=paste(letters[1:narrays]))
phenoData(eset) <- new("AnnotatedDataFrame", data=pData)

Obviously too much of the class structure and implementation detail
is being exposed to the user.

> Why is sampleNames() called internally when the user has not asked
> for it?

Because featureNames<- tries to update featureData (an
AnnotatedDataFrame) and, to keep the object consistent, does so via
sampleNames<-,AnnotatedDataFrame,character-method. AnnotatedDataFrame
has both featureNames<- and sampleNames<- methods, for reasons that
are partly historical and partly a not entirely implausible user
interface.
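
A small sketch of what that means in practice, assuming Biobase is installed; the three-row AnnotatedDataFrame and the names used are hypothetical:

```r
library(Biobase)

adf <- new("AnnotatedDataFrame",
           data=data.frame(row.names=paste("Gene", 1:3, sep="")))

# both accessors report the row names of the underlying data frame
featureNames(adf)
sampleNames(adf)

# a rename through featureNames<- is visible through sampleNames as well
featureNames(adf) <- c("A", "B", "C")
sampleNames(adf)
```

This is why an error raised while setting feature names on the featureData slot can mention `sampleNames<-`.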

Martin

> Gordon
>
>> narrays <- 4
>> nprobes <- 100
>> eset <- new("ExpressionSet")
>> exprs(eset) <- matrix(rnorm(nprobes*narrays),nprobes,narrays)
>> featureNames(eset) <- paste("Gene",1:nprobes,sep="")
> Error in `sampleNames<-`(`*tmp*`, value = c("Gene1", "Gene2", "Gene3",  :
>    number of new names (100) should equal number of rows in
>    AnnotatedDataFrame (0)
>
>
>> sessionInfo()
> R version 2.7.1 (2008-06-23)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252
>
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] Biobase_2.0.0
>
> _______________________________________________
> Bioc-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793