[R] reliability of R-Forge?

Marc Schwartz marc_schwartz at me.com
Thu Aug 26 18:59:48 CEST 2010


On Aug 26, 2010, at 11:01 AM, Dirk Eddelbuettel wrote:

> 
> On 26 August 2010 at 11:28, R P Herrold wrote:
> | On Thu, 26 Aug 2010, Gavin Simpson wrote:
> | 
> | > On Thu, 2010-08-26 at 02:30 -0400, David Kane wrote:
> | >> How reliable is R-Forge? http://r-forge.r-project.org/
> | >>
> | >> It is down now (for me). Reporting "R-Forge Could Not Connect to Database: "
> | 
> | late to chime in, so had tossed the first piece.  As this 
> | relates to 'reliability of R-Forge' in the sense of possible 
> | process issues, rather than availability of the archive, I 
> | wanted to 'tag into' this thread
> | 
> | I 'mirror' r-forge, so I have not seen this ...
> | 
> | One thing I note, mirroring r-forge, and processing 'diffs' 
> | between successive days, is that the md5sums of some packages 
> | regularly change without version number bumps.  From this 
> | morning's report in my email:
> | 
> | Thu Aug 26 04:30:01 EDT 2010
> | 
> | --- /tmp/rforge-pre.txt 2010-08-26 04:30:33.000000000 -0400
> | +++ /tmp/rforge-post.txt        2010-08-26 04:38:03.000000000 -0400
> | @@ -8,18 +8,18 @@
> |   AquaEnv_1.0-1.tar.gz   615059a5369d1aba149e6142fedffdde
> |   ArvoRe_0.1.6.tar.gz    c955ae7c64c4270740172ad2219060ff
> |   BB_2010.7-1.tar.gz     4f85093ab24fac5c0b91539ec6efb8b7
> | -BCE_2.0.tar.gz 5a3fe3ecabbe2b2e278f6a48fc19d18d
> | -BIOMOD_1.1-5.tar.gz    d2f74f21bc8858844f8d71627fd8e687
> | +BCE_2.0.tar.gz 65a968c586e729a1c1ca34a37f5c293a
> | +BIOMOD_1.1-5.tar.gz    6929e5ad6a14709de7065286ec684942
> |   ...
> | -BTSPAS_2010.08.tar.gz  16b8f265846a512c329f0b52ba1924ab
> | +BTSPAS_2010.08.tar.gz  809a96b11f1094e95b217af113abd0ac
> |   ...
> | -BayesR_0.1-1.tar.gz    72bd41c90845032eb9d15c4c6d086dec
> | +BayesFactorPCL_0.5.tar.gz      173ab741c399309314eff240a4c3cd6f
> | +BayesR_0.1-1.tar.gz    9560b511f1b955a60529599672d58fea
> |   ...
> | -BiplotGUI_0.0-6.tar.gz 594b3a275cde018eaa74e1ef974dd522
> | +BiplotGUI_0.0-6.tar.gz 857a484fdba6cb97be4e42e38bb6d0fd
> |   ...
> | -IsoGene_1.0-18.tar.gz  679a5aecb7182474ed6a870fa52ca2e3
> | +IsoGene_1.0-18.tar.gz  f37572957b2a9846a8d738ec88ac8690
> | 
> | and so forth.  I've not taken the time to understand why 
> | seemingly new versions are appearing without version bumps 
> | yet.
> | 
> | Is anyone aware of explanations, other than a release process 
> | that does not require unique versioning of differing content? 
> | [it seems pretty basic to me that a 'receiver' of new content 
> | could do the checks I do, and decline to push conflicting 
> | md5sums over an identically named prior candidate in archive]
> 
> Version numbers change only when DESCRIPTION/Version gets updated.
> 
> Content (of the tarball) and thusly md5sum changes whenever _any_ file
> in the archive changes.
> 
> Methinks you tricked yourself into assuming tarballs have to be constant
> because they are on CRAN _where changes happen only with new releases_.  
> 
> Dirk


I might also point out that the same process is in place for the daily tarballs of R itself, available via:

  ftp://ftp.stat.math.ethz.ch/Software/R/

The R version does not change with each new daily tarball, but the svn rev number changes with each new commit. Since the tarball is regenerated each day from the svn commits of the prior 24 hours, its checksum will differ from day to day, unless of course there have been no new commits in that time frame.
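
As a rough illustration of that point, here is a minimal Python sketch (not anything the R or R-Forge build scripts actually run). It builds two in-memory archives that carry the same Version field but differ in one source file, and compares their md5sums. The package name pkg, the file contents and the helper names build_tarball and md5_hex are all made up for the example.

```python
import hashlib
import io
import tarfile


def build_tarball(files):
    """Return the bytes of an uncompressed tar archive holding `files` (name -> text)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):
            data = files[name].encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0  # fixed timestamp, so only file content affects the checksum
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def md5_hex(data):
    """Hex md5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


# Hypothetical package contents on two successive days.
monday = {
    "pkg/DESCRIPTION": "Package: pkg\nVersion: 1.0-1\n",
    "pkg/R/f.R": "f <- function(x) x\n",
}
tuesday = dict(monday)
tuesday["pkg/R/f.R"] = "f <- function(x) x + 1\n"  # a commit without a version bump

same_version = monday["pkg/DESCRIPTION"] == tuesday["pkg/DESCRIPTION"]
checksums_differ = md5_hex(build_tarball(monday)) != md5_hex(build_tarball(tuesday))
print(same_version, checksums_differ)
```

The sketch uses an uncompressed archive with fixed timestamps so that the only thing varying is the file content. Real .tar.gz files also embed timestamps, which is one more reason not to expect byte-identical daily builds.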

In the case of the R tarballs, the generating script also includes a file called SVN-REVISION, which will contain something like:

Revision: 52804
Last Changed Date: 2010-08-25


You don't get that with the R-Forge tarballs, since that file/info is not included in R source package tarballs.
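
If you wanted to track the R daily tarballs by revision rather than by checksum, the SVN-REVISION file could be read with something like the following Python sketch. It assumes the file holds simple "Key: value" lines as in the sample above; the function name parse_svn_revision is made up for the example.

```python
def parse_svn_revision(text):
    """Parse 'Key: value' lines into a dict, ignoring blank or malformed lines."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


# Sample content copied from the example above.
sample = "Revision: 52804\nLast Changed Date: 2010-08-25\n"

info = parse_svn_revision(sample)
revision = int(info["Revision"])
print(revision, info["Last Changed Date"])
```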

The same would occur BTW, if you did an svn checkout rather than using the tarball directly. Your checkout would reflect the current rev state of the svn repo at that time, which will be more granular than a daily source tarball and might be different 5 minutes later, if a new commit occurred in that time frame.

This approach is part and parcel of using an svn repo. You can have main trunks (in the case of R, R-Devel), along with defined branches (e.g. R 2.x.y), each of which may get hundreds or thousands of commits over some window of time, without a version bump during that time frame. R-Forge packages will not have the same frequency of commits, but the same approach can be applied.

So you need to differentiate between the ongoing development/commit process and the versioned release process.
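
For a mirror, that distinction could be made explicit when comparing two daily listings. The Python sketch below is only an illustration under stated assumptions: it takes each listing to be lines of "filename md5sum" (as in the diff quoted above), and the names parse_listing and classify are made up. The sample entries are copied from that diff.

```python
def parse_listing(text):
    """Map tarball filename -> md5sum from lines of the form 'filename md5sum'."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            entries[parts[0]] = parts[1]
    return entries


def classify(pre, post):
    """Return (rebuilt, added): same name with a new md5sum, and names not seen before."""
    rebuilt = sorted(n for n in post if n in pre and pre[n] != post[n])
    added = sorted(n for n in post if n not in pre)
    return rebuilt, added


# A few entries from the pre/post listings in the quoted diff.
pre = parse_listing("""\
BB_2010.7-1.tar.gz     4f85093ab24fac5c0b91539ec6efb8b7
BCE_2.0.tar.gz 5a3fe3ecabbe2b2e278f6a48fc19d18d
""")
post = parse_listing("""\
BB_2010.7-1.tar.gz     4f85093ab24fac5c0b91539ec6efb8b7
BCE_2.0.tar.gz 65a968c586e729a1c1ca34a37f5c293a
BayesFactorPCL_0.5.tar.gz      173ab741c399309314eff240a4c3cd6f
""")

rebuilt, added = classify(pre, post)
print("rebuilt without version bump:", rebuilt)
print("new in listing:", added)
```

Entries in the rebuilt list are development snapshots that changed under an unchanged version number, which is expected on R-Forge; only a changed version string in the filename signals a versioned release.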

HTH,

Marc Schwartz


