[Bioc-devel] assayDat and ExpressionSet
Kasper Daniel Hansen
khansen at stat.Berkeley.EDU
Tue Oct 16 06:14:32 CEST 2007
On Oct 15, 2007, at 7:39 PM, Martin Morgan wrote:
> Vincent Carey 525-2265 <stvjc at channing.harvard.edu> writes:
>
>>> Of course.
>>>
>>> While I am at it: we are dropping phenoData completely, right? It is a
>>> bit confusing that all the new classes have phenoData slots which are
>>> of class AnnotatedDataFrame. Well, it is mostly confusing because the
>>> phenoData class is still around.
>>
>> i think the phenoData class should be defunct at next release,
>> but the slots should retain their names.
>>
>>> Shouldn't varLabels(annotatedDataFrame) return
>>> object@varMetadata$labelDescription if the varMetadata slot exists
>>> instead of as now
>
> Valid objects always have the slot, with labelDescription always
> present.
Well, the slot can be full of NA's. Example:
R> example("AnnotatedDataFrame-class")
R> obj2 = new("AnnotatedDataFrame", data = obj@data)
R> obj2@varMetadata
  labelDescription
x             <NA>
y             <NA>
z             <NA>
The help page says "varMetadata and dimLabels can be missing."
I agree that in principle the varMetadata slot exists, but it is
essentially empty.
>>> where it just yields names(object@data)
>
> names(pData(object)) ;)
>
>> that seems like a reasonable expectation.
>
> varLabels suggests variable labels rather than label
> descriptions. This was from the time before me, but I'd always thought
> of the use in an interactive context and primarily at the
> ExpressionSet level -- what are the labels of the covariates in this
> analysis? Oh yes, now I'll ExpressionSet$whatever. Since varLabels is
> a generic, and since a variety of different classes in and out of
> Biobase use AnnotatedDataFrame, you'll be changing a lot of output.
I am a bit unsure what the difference is between variable labels and
label descriptions. The historic intention (I dare say) and use of
varLabels was to provide additional information about the covariates
in the pData slot. You might, for example, have a covariate named
Spres, and the varLabel might say "Systolic blood pressure". People of
course thought about including things such as units of measurement,
which can now be incorporated as a separate column in the varMetadata
slot.
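As a rough sketch of that idea, here are plain data frames standing in
for the pData and varMetadata slots. The covariate names, the
descriptions, and the "units" column are all made up for illustration;
only labelDescription is a column name taken from the real class.

```r
# Plain data frames standing in for the pData and varMetadata slots.
# Spres, age, and the units column are illustrative assumptions.
pdata <- data.frame(Spres = c(120, 135, 128), age = c(54, 61, 47))

var_meta <- data.frame(
  labelDescription = c("Systolic blood pressure", "Age at enrollment"),
  units = c("mmHg", "years"),
  row.names = colnames(pdata),
  stringsAsFactors = FALSE
)

# Look up the description and units of one covariate by name.
var_meta["Spres", ]
```

Each row of the metadata table describes one column of the data, so
extra columns such as units sit next to the description.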
varLabels was probably primarily used in show/summary methods and for
many users (I dare say) probably not used very much since they would
say "age is self-describing". I guess one of the intentions was to
make the objects more self-documenting in case you exchange objects
with other people.
Anyway, from the name labelDescription I got the distinct impression
that the idea was similar. I agree that the AnnotatedDataFrame
example shows otherwise:
R> obj@varMetadata
  labelDescription
x          Numbers
y    Factor levels
z       Characters
Here it seems as if the labelDescription simply contains a rather
unhelpful description of the type of quantity rather than what the
variable represents. On the other hand, the "as" conversion from
phenoData to AnnotatedDataFrame simply copies the varLabels from the
phenoData object into the labelDescription in the AnnotatedDataFrame,
which supports my interpretation: the labelDescription corresponds to
the old-style varLabels.
Looking more closely at the code, the new varLabels and varLabels<-
methods make sure that rownames(object@varMetadata) is equal to
colnames(object@data), a kind of synchronization we never needed with
the old class. Clearly it is useful to have something doing this. I
would personally have preferred to use "names" for this (more
similarity with data.frame) and then varLabels for accessing the
labelDescription.
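A minimal sketch of that synchronization, again using plain data frames
in place of the real slots. rename_vars is a hypothetical helper written
for this example, not a Biobase function, and it does not reproduce the
actual varLabels<- code.

```r
# Hypothetical helper, not part of Biobase: rename the variables of a
# data / metadata pair so that both sets of names stay in step.
rename_vars <- function(data, var_meta, new_names) {
  stopifnot(length(new_names) == ncol(data),
            nrow(var_meta) == ncol(data))
  colnames(data) <- new_names
  rownames(var_meta) <- new_names
  list(data = data, varMetadata = var_meta)
}

data <- data.frame(x = 1:3, y = factor(c("a", "b", "a")))
var_meta <- data.frame(labelDescription = c("Numbers", "Factor levels"),
                       row.names = colnames(data),
                       stringsAsFactors = FALSE)

res <- rename_vars(data, var_meta, c("dose", "group"))

# The invariant: metadata row names match the data column names.
identical(rownames(res$varMetadata), colnames(res$data))
```

Renaming through a single function is what keeps the two tables from
drifting apart; the descriptions themselves are untouched.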
> Current use trumps a lot of change. There will be many people who
> rightly expect varLabels to do what it has done for the last several
> years. Better (in my opinion) alternatives are
>
> - labelDescription generic + methods
> - varMetadata(obj)$labelDescription
I am not too sure how widespread the current use is outside of Biobase.
Well, this ended up being a far too long post about something
relatively obscure, which is only confusing to me because of how it
used to be.
Kasper
>> there is a BiobaseDevelopment.Rnw of 4 Sept. 2006(!) in Biobase/inst/doc
>> that seems to be the current locus classicus on eSet and descendants.
>> this should be revisited and made definitive for 2.1 ... i will try to
>> put some effort into this -- after i read it!