[Rd] Suggestion: Dimension-sensitive attributes
Heinz Tuechler
tuechler at gmx.at
Thu Jul 9 11:48:20 CEST 2009
At 11:14 09.07.2009, SIES 73 wrote:
> > If "objattr", "dimattr" and "cellattr" are
> > lists, they would offer safe places for all
> > attributes that should be kept on subsetting.
>
>My proposed design would be that:
>
>  * "objattr" would be a list of attributes (just preserved on subsetting)
>  * "dimattr" would be a list with as many elements as array dimensions.
>    Each element can be any object whose length matches the corresponding
>    array dimension's length and that can be itself subsetted with "[":
>    so it could be a vector, a list, a data frame...
>  * "cellattr" would be any object whose dimensions match the array
>    dimensions: another array, a data frame...
>
> > In my view this would be very useful, because
> > that way a general solution for data
> > description, like variable names, variable labels, units, ... could be reached.
>
>Indeed, that's the objective: attaching
>user-defined metadata that is automatically
>synchronized with subsetting operations to the actual data.
>
>I've had dozens of use cases on my own R
>programs that needed this type of pattern, and
>seen it implemented in different ways in several
>classes (xts, timeSeries, AnnotatedDataFrame,
>etc.). As you point out, this could offer a unified design for a common need.
>
>Enrique
For my personal use it was sufficient to create a
class called "documented" with a corresponding
subsetting method and one attribute, also called
"documented". This attribute may contain
'varlabel', 'varname', 'value.labels',
'missing.values', 'code.ordered', 'comment', ...
It is copied on subsetting.
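
The pattern described above can be sketched roughly as follows. This is only a minimal illustration, not the actual class: the constructor documented() and its arguments are assumptions made for the example, and only the class name, the attribute name "documented" and the copy-on-subsetting behaviour come from the description.

```r
# Hypothetical constructor (an assumption for this sketch): store the
# documentation as a list in a single attribute called "documented".
documented <- function(x, ...) {
  attr(x, "documented") <- list(...)
  class(x) <- c("documented", setdiff(oldClass(x), "documented"))
  x
}

# Subsetting method: the default "[" drops most attributes, so the
# "documented" attribute and the class are copied back onto the result.
`[.documented` <- function(x, ...) {
  doc <- attr(x, "documented")
  cls <- oldClass(x)
  y <- NextMethod("[")
  attr(y, "documented") <- doc
  oldClass(y) <- cls
  y
}

age <- documented(c(23, 35, 41, 58),
                  varname  = "age",
                  varlabel = "Age in years",
                  comment  = "illustrative values only")
sub <- age[2:3]
attr(sub, "documented")$varlabel   # "Age in years"
```

The whole attribute is copied unchanged here, which is enough for object-level documentation such as labels and comments.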
I think attributes concerning parts of an object,
e.g. dimensions, should stay in this
object-related attribute and be extracted on
subsetting. Since subsetting an object leads to a
new object, this could then have its own, new persisting attribute.
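
To make the three levels discussed in this thread concrete, here is a rough sketch for the matrix case. The helper subset_annotated() is hypothetical and not part of base R; the attribute names "objattr", "dimattr" and "cellattr" are the ones proposed in the quoted messages. It assumes that both indices are supplied as integer or logical vectors, that each "dimattr" element is a plain vector or list, and that "cellattr" is a matrix of the same shape as the data.

```r
# Hypothetical helper (not part of base R): subset a matrix and keep the
# three proposed attributes in sync with the result.
subset_annotated <- function(x, i, j) {
  obj  <- attr(x, "objattr")
  dima <- attr(x, "dimattr")
  cell <- attr(x, "cellattr")

  # Base subsetting keeps only dim and dimnames, so the rest is re-attached.
  y <- x[i, j, drop = FALSE]

  # Object level: copied unchanged.
  attr(y, "objattr") <- obj
  # Dimension level: each element is subsetted along its own dimension.
  if (!is.null(dima)) {
    attr(y, "dimattr") <- list(dima[[1]][i], dima[[2]][j])
  }
  # Cell level: subsetted exactly like the data.
  if (!is.null(cell)) {
    attr(y, "cellattr") <- cell[i, j, drop = FALSE]
  }
  y
}

m <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), c("x", "y", "z")))
attr(m, "objattr")  <- list(source = "example data")
attr(m, "dimattr")  <- list(c("row A", "row B"),
                            c("metres", "seconds", "kilograms"))
attr(m, "cellattr") <- matrix(c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE), nrow = 2)

s <- subset_annotated(m, 2, c(1, 3))
attr(s, "dimattr")   # list("row B", c("metres", "kilograms"))
```

A real implementation would also have to deal with missing or character indices, drop = TRUE, higher-dimensional arrays and data-frame elements in "dimattr", none of which this sketch attempts.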
The more difficult part may be the binding of objects.
Heinz
>-----Original Message-----
>From: Heinz Tuechler [mailto:tuechler at gmx.at]
>Sent: jueves, 09 de julio de 2009 10:56
>To: Bengoechea Bartolomé Enrique (SIES 73); Tony Plate; r-devel at r-project.org
>Cc: Henrik Bengtsson
>Subject: Re: [Rd] Suggestion: Dimension-sensitive attributes
>
>At 10:01 09.07.2009, SIES 73 wrote:
> >I've also had several use cases where I needed "cell-like" attributes,
> >that is, attributes that have the same dimensions as the original array
> >and are subsetted in the same way --along all its dimensions.
> >
> >So we're talking about a way to add metadata to matrices/arrays at 3
> >possible levels:
> >
> > 1) at the "whole object" level:
> > attributes that are not dropped on subsetting
> > 2) at the "dimension" level: attributes that behave like
> > "dimnames", i.e. subsetted along each dimension
> > 3) at the "cell" level: attributes that are subsetted in the
> > same way as the original array
> >
> >My proposal would be simpler than Tony's
> >suggestion: like "dimnames", just have reserved attribute names for
> >each case, say "objdata", "dimdata", and "celldata" (or "objattr",
> >"dimattr" and "cellattr").
>
>If "objattr", "dimattr" and "cellattr" are
>lists, they would offer safe places for all
>attributes that should be kept on subsetting. In
>my view this would be very useful, because that
>way a general solution for data description,
>like variable names, variable labels, units, ... could be reached.
>
>
> >On the other hand, Tony's pattern would allow as many attributes of
> >each type as necessary (some multiplicity is already possible with the
> >simpler design as dimdata or celldata could be lists of lists), at the
> >cost of a more complex scheme of attributes that needs to be "parsed"
> >each time.
> >
> >On Tony's suggestion, "attr.keep.on.subset" and "attr.dimname.like"
> >(and possibly "attr.cell.like") could be kept on a single list with
> >3 elements, something like:
> >
> > > attr(x, "attr.subset.with") <- list(object=..., dims=..., cells=...)
> >
> >Would something like this make sense for R-core --either for standard
> >arrays or as a new class-- or would it be better implemented in a
> >package?
> >
> >Enrique
> >