[Rd] a couple of ideas/proposals

Robert Gentleman rgentlem@jimmy.harvard.edu
Wed, 11 Apr 2001 09:29:27 -0400


 Byron Ellis has been making some progress on an hdf5 library for
 microarray data (and anything else you want to put in there).
 In doing so, some issues have arisen that are of more general
 interest.

  1) hdf5 supports annotation (through comments), so it would be nice
  if the comment function in R became generic. I think this is
  backward compatible and should not cause any problems.

  2) When we start to think of proxy objects (such as database tables
  and hdf5 arrays) as if they were R objects of a particular type,
  things become rather interesting.
     I'm starting to think that the right abstraction for hdf5 is
  arrays (not dataframes), since the slabs are homogeneous. For us they
  are just ints or floats, but they can be arbitrary, so maybe the
  abstraction really is dataframe.
     In any event, the interesting question is about arraySubscript
  (and, I guess, other similar functionality).
     Currently this takes (int dim, SEXP s, SEXP x),
  where s is the subscript argument and x is the object itself. All
  we do with x is ask it for its dim attribute and potentially its
  dimnames attribute.
     We would like to use arraySubscript to generate the subscripts
  for [.hdf5.dataset. But since these aren't really R arrays, this
  is slightly problematic. It might be better if we either change
  arraySubscript to accept the dims and dimnames but not x (so x can
  really be anything) or add a new entry point that accepts only
  these. I think the latter is the correct route (do we have a way
  of deprecating R internals?).
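
     To make the second option concrete, here is a rough sketch in
  plain C of a subscript resolver that sees only an extent and never
  the object x. The name resolve_subscript and its signature are
  hypothetical, not an existing R entry point, and the rules are a
  simplification loosely modelled on R's integer subscripts (no
  dimnames, no logical subscripts).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical entry point (not part of R): resolve a 1-based integer
 * subscript vector s of length ns against one dimension of size extent.
 * Returns a malloc'd array of 0-based offsets and stores its length in
 * *n_out, or returns NULL on error. The caller frees the result. */
long *resolve_subscript(const long *s, size_t ns, long extent, size_t *n_out)
{
    size_t npos = 0, nneg = 0;

    assert(s != NULL || ns == 0);
    assert(n_out != NULL);
    *n_out = 0;
    if (extent < 0)
        return NULL;

    for (size_t i = 0; i < ns; i++) {
        if (s[i] > extent || s[i] < -extent)
            return NULL;                    /* out of bounds */
        if (s[i] > 0)
            npos++;
        else if (s[i] < 0)
            nneg++;
    }
    if (npos > 0 && nneg > 0)
        return NULL;                        /* cannot mix signs */

    /* Allocate at least one slot so an empty selection is non-NULL. */
    size_t cap = (nneg > 0) ? (size_t)extent : npos;
    long *out = malloc((cap > 0 ? cap : 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    size_t n = 0;
    if (nneg > 0) {
        /* Negative subscripts: keep every position that is not listed. */
        for (long k = 1; k <= extent; k++) {
            int dropped = 0;
            for (size_t i = 0; i < ns; i++) {
                if (s[i] == -k) {
                    dropped = 1;
                    break;
                }
            }
            if (!dropped)
                out[n++] = k - 1;
        }
    } else {
        /* Positive subscripts: convert to 0-based; zeros are skipped. */
        for (size_t i = 0; i < ns; i++)
            if (s[i] > 0)
                out[n++] = s[i] - 1;
    }
    *n_out = n;
    return out;
}
```

     Because the resolver never touches x, a [.hdf5.dataset method
  could call it once per dimension with the extents read from the
  file and then use the offsets to select the slab.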
    I think that in general it would be nice for as much of
  the internal functionality as possible to move towards supporting
  "non-internal" versions of some data structures, especially arrays
  and dataframes, since there seems to be some advantage to having
  external versions of them.

    Another thing that Byron mentioned is that he is using finalizers
    quite a lot, but R doesn't do a gc on exit, so they don't always
    get called (I haven't checked). I think a gc on exit might
    become necessary.
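
    As an illustration only (this is not how R works), the idea of
    running outstanding finalizers at exit can be sketched in plain C
    with a list of pending finalizers and an atexit hook. All names
    below are hypothetical, and count_finalizer is just a stand-in
    for something like closing an hdf5 handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*finalizer_fn)(void *);

/* One pending finalizer: the object and the function to call on it. */
struct pending {
    void *obj;
    finalizer_fn fn;
    struct pending *next;
};

static struct pending *pending_head = NULL;

/* Remember that fn(obj) must run before the process ends.
 * Returns 0 on success, -1 if memory could not be allocated. */
int register_finalizer(void *obj, finalizer_fn fn)
{
    assert(fn != NULL);
    struct pending *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->obj = obj;
    p->fn = fn;
    p->next = pending_head;
    pending_head = p;
    return 0;
}

/* Run every finalizer that has not run yet, most recent first.
 * Each entry is unlinked before its function is called, so a second
 * call is a no-op and a finalizer may safely register another one. */
void run_pending_finalizers(void)
{
    while (pending_head != NULL) {
        struct pending *p = pending_head;
        pending_head = p->next;
        p->fn(p->obj);
        free(p);
    }
}

/* Arrange for pending finalizers to run at normal process exit.
 * Returns 0 on success, as atexit does. */
int install_exit_hook(void)
{
    return atexit(run_pending_finalizers);
}

/* Example finalizer: counts how many times it was called. */
void count_finalizer(void *obj)
{
    ++*(int *)obj;
}
```

    A collector would also remove an entry once it had finalized the
    object itself; that part is left out here.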
-- 
+---------------------------------------------------------------------------+
| Robert Gentleman                 phone : (617) 632-5250                   |
| Associate Professor              fax:   (617)  632-2444                   |
| Department of Biostatistics      office: M1B28                            |
| Harvard School of Public Health  email: rgentlem@jimmy.dfci.harvard.edu   |
+---------------------------------------------------------------------------+