[R] Need Advice: Considering Converting a Package from S3 to S4

James W. MacDonald jmacdon at med.umich.edu
Mon Aug 10 19:17:59 CEST 2009



Bryan Hanson wrote:
> Hello R Folks...  
> 
> Not a technical question, but I need some advice and perspective.
> 
> I've got a set of functions I'm planning to put together into a package.
> The main hunk of data that gets used by different functions is currently an
> S3 list.  I've been reading about S4 objects, and I see the (numerous)
> advantages of them.  I have seen the recommendation that all new packages be
> done with S4.  Before I get much farther, I need to decide if I will go to
> S4 for this central hunk of data.
> 
> My questions are about making the conversion, whether it is worth the
> trouble and what pitfalls I might encounter.  I can easily (re)define my key
> list as an S4 object.  But after that...
> 
> 1.  It seems the simplest/minimalist approach is to update all the
> functions so that where I use "data$element" I replace it with "data@slot".
> Is it really this easy, or have I missed something?  Easy or not, this by
> itself doesn't take advantage of much, except the ability to define
> subclasses at a later date (maybe that is sufficient reason though).
> 
> 2.  I also see in my reading that I should consider writing accessor
> functions for my object.  What I can't quite see is why I would want to do
> this, if I can get the contents with "data@slot"?  What am I missing here?

That recommendation is aimed at user-level code rather than at the 
functions inside your package. The idea is that you might want to 
change the internal representation of your data later, while keeping 
the API for the end user the same.
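
As a minimal sketch of what that looks like (the class name "Spectra", 
its slots, and the accessor name are made up for illustration, not taken 
from the package in question), users call intensity(s) and never touch 
s@intensity directly:

```r
library(methods)

# Hypothetical S4 class standing in for the central data object
setClass("Spectra",
         representation(freq = "numeric", intensity = "numeric"))

# Accessor: a generic plus a method for the class
setGeneric("intensity", function(object) standardGeneric("intensity"))
setMethod("intensity", "Spectra", function(object) object@intensity)

s <- new("Spectra", freq = c(1, 2, 3), intensity = c(10, 20, 30))

# User-level code goes through the accessor, not the slot
vals <- intensity(s)
```

If the slot is later renamed or the values end up being computed from 
something else, only the method body has to change; code that calls 
intensity() keeps working.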

> 
> 3. At this point, I'm not sure that I would write specific methods for this
> proposed S4 object.  It would not be necessary in the short run.  Making it
> S4 would mainly allow for "future expansion" as they say.  If methods are
> not critical, does it make sense to spend the time making the change?

I think it depends on the purpose of the data and whether or not you 
envision having more data of the same type. The Bioconductor project has 
made extensive use of S4 data containers to encapsulate all manner of 
high-throughput data. Since the end user is often the one creating the 
data object, this has allowed us to add lots of validity checks to 
ensure that the object is what is expected by the downstream functions.
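
To give a rough idea of such a check (again with a hypothetical 
"Spectra" class and slot names, and a rule chosen only for the example), 
a validity function can be supplied when the class is defined, and new() 
will then refuse to build an inconsistent object:

```r
library(methods)

# Hypothetical class with a validity check attached
setClass("Spectra",
         representation(freq = "numeric", intensity = "numeric"),
         validity = function(object) {
           # Assumed rule for illustration: the two slots must line up
           if (length(object@freq) != length(object@intensity))
             return("freq and intensity must have the same length")
           TRUE
         })

# A consistent object is created as usual
ok <- new("Spectra", freq = c(1, 2, 3), intensity = c(10, 20, 30))

# An inconsistent one triggers an error; capture the message here
bad <- tryCatch(new("Spectra", freq = c(1, 2), intensity = 10),
                error = function(e) conditionMessage(e))
```

Downstream functions can then assume the slots are consistent instead 
of re-checking them on every call.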

If your package simply contains some static data and some functions to 
operate on those data, using S4 might be overkill.

Best,

Jim


> 
> Any perspective and advice would be welcomed.  Thanks in advance, Bryan
> *************
> Bryan Hanson
> Professor of Chemistry & Biochemistry
> DePauw University, Greencastle IN USA
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826



