[Bioc-devel] Proposal: Generic array annotation via dimattr() or similar (Was: Re: BioC 2.5: Added scanDates slot to Biobase's eSet class)

Mon Jun 29 21:55:46 CEST 2009

[Feel free to move this to r-devel, which is more suited for the topic]

Hi,

it's good to hear that I'm not the only one thinking in these terms.  It
would be great to get some prototypes of this going.  In order to
avoid having to modify the R core, an implementation via (multiple)
inheritance could solve it.  It sounds like Enrique might already have
something towards this.

I won't have much time for designing this from scratch, but I am happy
to give feedback and be a tester of it.
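
To make the idea a bit more concrete, here is a rough sketch of what a
dimattr() getter/setter could look like in plain R, without touching
the R core.  Everything in it is an assumption for illustration only:
the function names dimattr(), `dimattr<-`() and subset_cols(), the
'along' argument, and the choice to keep the annotations in an
attribute called "dimattr".  It is not an existing API and it is not
Enrique's implementation.

```r
# Hypothetical getter: return annotation 'name' attached along dimension 'along'.
dimattr <- function(x, name, along) {
  store <- attr(x, "dimattr")
  if (is.null(store)) return(NULL)
  store[[along]][[name]]
}

# Hypothetical setter: one annotation value per index of dimension 'along'.
`dimattr<-` <- function(x, name, along, value) {
  d <- dim(x)
  stopifnot(!is.null(d), along >= 1, along <= length(d))
  if (!is.null(value) && length(value) != d[along])
    stop("annotation length must match the extent of dimension ", along)
  store <- attr(x, "dimattr")
  # One (initially empty) list of annotations per dimension.
  if (is.null(store)) store <- rep(list(list()), length(d))
  store[[along]][[name]] <- value
  attr(x, "dimattr") <- store
  x
}

# Plain '[' drops custom attributes, so column subsetting has to carry
# the column annotations along explicitly (matrix case only).
subset_cols <- function(x, j) {
  y <- x[, j, drop = FALSE]
  store <- attr(x, "dimattr")
  if (!is.null(store)) {
    store[[2]] <- lapply(store[[2]], `[`, j)
    attr(y, "dimattr") <- store
  }
  y
}

# Example: one scan date per column (sample) of a 2 x 3 matrix.
x <- matrix(1:6, nrow = 2,
            dimnames = list(c("f1", "f2"), c("a", "b", "c")))
dimattr(x, "scanDate", along = 2) <-
  as.Date(c("2009-06-01", "2009-06-02", "2009-06-03"))
y <- subset_cols(x, c(1, 3))
dimattr(y, "scanDate", along = 2)
```

The point of the last helper is that the annotation follows the
dimension when the array is subsetted, which is what plain attributes
do not do today.  A real design would presumably do this inside a "["
method of a new class rather than in a separate function.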

Cheers,

Henrik

On Fri, Jun 26, 2009 at 5:39 AM, Laurent Gautier<laurent at cbs.dtu.dk> wrote:
> [I almost missed that one]
>
> Generalization seems like a good idea.
>
> In the present context (eSet), that would point in the direction of dropping
> semantic considerations (the phenoData / arrayData earlier in this thread)
> since what we have is just "columnData" (and featureData are "rowData").
>
> I'd be willing to help put together a prototype to see how this could work
> (maybe constraining things to one data.frame per dimension, rather than
> free lists as in your example)
>
>
> L.
>

[snip]

On Thu, Jun 25, 2009 at 2:05 AM, Bengoechea Bartolomé Enrique (SIES
73)<enrique.bengoechea at credit-suisse.com> wrote:
> Hi Henrik,
>
> Although I work in a different field (finance), I completely agree that your proposal would be one of the most useful infrastructure additions to R. I saw your post on R-devel and completely agreed with the suggestion (sad that nobody from R-base answered). It might have important performance implications for standard arrays, so maybe a new class could be envisioned for it? I would volunteer to work on such a project, as I've been writing a lot of R code like that in the past.
>
> For example, I've developed a class similar to AnnotatedDataFrame but based on S3 and with some pre-defined meta-information (prefix the type of each variable, user-defined coercion behaviour, allow/disallow NAs per variable, etc.), which will soon be submitted to CRAN, and I have experienced first-hand the need for this type of infrastructure.
>
> Best,
>
> Enrique
>

[snip]

>
> Henrik Bengtsson wrote:
>>
>> To generalize this above the level of biology etc, some annotation
>> data maps naturally *along the dimension* of arrays, not to each cell,
>> e.g. rownames() and colnames() of a matrix.  One timestamp per array
>> is such an attribute.  On June 7, 2009 I sent the message '[Rd]
>> Suggestion: Dimension-sensitive attributes' to R-devel;
>>
>>  http://tolstoy.newcastle.edu.au/R/e6/devel/09/06/2043.html
>>
>> suggesting a generic dimattr(<obj>, <name>) "getter" and
>> "setter".  See example in that message.  Maybe such a design pattern
>> helps here too?
>>
>> Also, the MAGE people should already have spent a lot of time thinking
>> and designing this kind of stuff.  Maybe something there?
>>
>> /Henrik
>>

[snip]