[Bioc-devel] assayDat and ExpressionSet

Kasper Daniel Hansen khansen at stat.Berkeley.EDU
Tue Oct 16 06:14:32 CEST 2007


On Oct 15, 2007, at 7:39 PM, Martin Morgan wrote:

> Vincent Carey 525-2265 <stvjc at channing.harvard.edu> writes:
>
>>> Of course.
>>>
>>> While I am at it: we are dropping phenoData completely, right? It is a
>>> bit confusing that all the new classes have phenoData slots which are
>>> of class AnnotatedDataFrame. Well, it is mostly confusing because the
>>> phenoData class is still around.
>>
>> i think the phenoData class should be defunct at next release,
>> but the slots should retain their names.
>>
>>> Shouldn't varLabels(annotatedDataFrame) return
>>> object@varMetadata$labelDescription if the varMetadata slot exists,
>>> instead of as now
>
> Valid objects always have the slot, with labelDescription always
> present.

Well, the slot can be full of NAs. Example:

   R> example("AnnotatedDataFrame-class")
   R> obj2 = new("AnnotatedDataFrame", data = obj@data)
   R> obj2@varMetadata
     labelDescription
   x             <NA>
   y             <NA>
   z             <NA>

The help page says "varMetadata and dimLabels can be missing."

I agree that in principle the varMetadata slot exists, but it is  
essentially empty.
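
To make the point reproducible without the help-page object, here is a
minimal sketch. It assumes a current Biobase behaves as described in
this thread; the data frame and the name "adf" are made up for
illustration.

```r
library(Biobase)

# Hypothetical toy data; only the column names mirror the help-page example.
df <- data.frame(x = 1:3,
                 y = factor(c("a", "b", "c")),
                 z = c("p", "q", "r"),
                 stringsAsFactors = FALSE)

# No varMetadata supplied, so the slot is created for us.
adf <- new("AnnotatedDataFrame", data = df)

# The slot exists, but the descriptions carry no information.
print(varMetadata(adf))
print(all(is.na(varMetadata(adf)$labelDescription)))

# varLabels() gives the column names, not the descriptions.
print(varLabels(adf))
```

So the object is valid, but there is nothing in labelDescription for
varLabels to return in this case.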

>>> where it just yields names(object@data)
>
> names(pData(object)) ;)
>
>> that seems like a reasonable expectation.
>
> varLabels suggests variable labels rather than label
> descriptions. This was from the time before me, but I'd always thought
> of the use in an interactive context and primarily at the
> ExpressionSet level -- what are the labels of the covariates in this
> analysis? Oh yes, now I'll ExpressionSet$whatever. Since varLabels is
> a generic, and since a variety of different classes in and out of
> Biobase use AnnotatedDataFrame, you'll be changing a lot of output.

I am a bit unsure about what the difference is between variable
labels and label descriptions. The historic intention (I dare say)
and use of varLabels was to provide additional information about the
covariates in the pData slot. You might for example have a covariate
named Spres and the varLabel might say "Systolic blood pressure".
People of course thought about including things such as units of
measurement etc., which can now be incorporated as a separate column
in the varMetadata slot.
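
As a sketch of what I mean, assuming the AnnotatedDataFrame constructor
accepts extra varMetadata columns next to labelDescription (the values,
the "age" description and the column name "units" are my own invention
for the example):

```r
library(Biobase)

# Hypothetical phenotype data for three samples.
pd <- data.frame(Spres = c(120, 135, 110),
                 age   = c(34, 51, 47))

# One row per covariate; row names must match the columns of pd.
# "units" is an illustrative extra column, not something Biobase defines.
meta <- data.frame(
  labelDescription = c("Systolic blood pressure", "Age"),
  units            = c("mmHg", "years"),
  row.names        = c("Spres", "age"),
  stringsAsFactors = FALSE
)

adf <- new("AnnotatedDataFrame", data = pd, varMetadata = meta)

# Look up the description and the units of a single covariate.
print(varMetadata(adf)["Spres", "labelDescription"])
print(varMetadata(adf)["Spres", "units"])
```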

varLabels was probably primarily used in show/summary methods and for  
many users (I dare say) probably not used very much since they would  
say "age is self-describing". I guess one of the intentions was to  
make the objects more self-documenting in case you exchange objects  
with other people.

Anyway, from the name labelDescription I got the distinct impression
that the idea was similar. I agree that the AnnotatedDataFrame
example shows otherwise:

   R> obj@varMetadata
     labelDescription
   x          Numbers
   y    Factor levels
   z       Characters

Here it seems as if the labelDescription simply contains a rather
unhelpful description of the type of quantity, not what the variable
represents. On the other hand, the "as" conversion from phenoData to
AnnotatedDataFrame simply copies the varLabels from the phenoData
object into the labelDescription in the AnnotatedDataFrame, which
supports my interpretation: the labelDescription corresponds to the
old-style varLabels.

Looking more closely at the code, the new varLabels and varLabels<-
methods make sure that rownames(object@varMetadata) is equal to
colnames(object@data), a kind of synchronization we never needed with
the old class. Clearly it would be useful to have something doing
this. I would personally have preferred to use "names" for this
(more similar to data.frame), and then varLabels for accessing the
labelDescription.
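
Roughly, this is the split I have in mind. It is only a sketch: the
helper labelDescriptions() below is hypothetical and not part of
Biobase, and the toy data are invented.

```r
library(Biobase)

# Hypothetical toy object with real descriptions filled in.
pd <- data.frame(Spres = c(120, 135, 110),
                 age   = c(34, 51, 47))
meta <- data.frame(
  labelDescription = c("Systolic blood pressure", "Age"),
  row.names        = c("Spres", "age"),
  stringsAsFactors = FALSE
)
adf <- new("AnnotatedDataFrame", data = pd, varMetadata = meta)

# The synchronization discussed above: varMetadata rows line up with
# the columns of the data.
in_sync <- identical(rownames(varMetadata(adf)), colnames(pData(adf)))
print(in_sync)

# Hypothetical accessor (not a Biobase function): return the
# descriptions, named by the covariate they describe.
labelDescriptions <- function(object) {
  md <- varMetadata(object)
  setNames(md$labelDescription, rownames(md))
}

print(labelDescriptions(adf))
```

With something like this, the column names and the descriptions each
have their own accessor, and neither has to change what varLabels
returns today.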

> Current use trumps a lot of change. There will be many people who
> rightly expect varLabels to do what it has done for the last several
> years. Better (in my opinion) alternatives are
>
> - labelDescription generic + methods
> - varMetadata(obj)$labelDescription

I am not too sure how widespread the current use is outside of Biobase.

Well, this ended up being a far too long post about something
relatively obscure, which is only confusing to me because of how it
used to be.

Kasper


>> there is a BiobaseDevelopment.Rnw of 4 Sept. 2006(!) in
>> Biobase/inst/doc that seems to be the current locus classicus on eSet
>> and descendants. this should be revisited and made definitive for
>> 2.1 ... i will try to put some effort into this -- after i read it!