[R] Which CRAN mirror is the fastest ?
Martin Maechler
maechler at stat.math.ethz.ch
Thu Jul 30 11:49:54 CEST 2009
>>>>> Barry Rowlingson <b.rowlingson at lancaster.ac.uk>
>>>>> on Thu, 30 Jul 2009 09:59:47 +0100 writes:
> 2009/7/30 Uwe Ligges <ligges at statistik.tu-dortmund.de>:
>> Hard to tell; you have to try it out, I fear.
>>
>> The speed you see highly depends on the connection from your country to
>> others, but of course, there are also some mirrors that are not the fastest
>> themselves.
> I figured you could write a function that got the CRAN mirror list and
> tested their response. Here's my 'cranometer':
> cranometer <- function(ms = getCRANmirrors(all = FALSE, local.only = FALSE)){
>   dest = tempfile()
>   nms = dim(ms)[1]
>   ms$t = rep(NA,nms)
>   for(i in 1:nms){
>     m = ms[i,]
>     url = paste(m$URL,"/src/base/NEWS",sep="")
>     t = try(system.time(download.file(url,dest),gcFirst=TRUE))
>     if(file.exists(dest)){
>       file.remove(dest)
>       ms$t[i]=t['elapsed']
>     }else{
>       ms$t[i]=NA
>     }
>   }
>   return(ms)
> }
> It works by downloading the latest NEWS file (376Kbytes at the
> moment, so not huge) from each of the mirror sites in the CRAN mirrors
> list. If you want to test it on a subset then call getCRANmirrors
> yourself and subset it somehow.
> I'm running it now on the full CRAN list and I've yet to find a
> timeout or error so I'm not sure what will happen if download.file
> fails. It returns a data frame like you get from getCRANmirrors but
> with an extra 't' column giving the elapsed time to get the NEWS file.
> CAVEATS: if your network has any local caching then these results
> will be wrong, since your computer will probably be getting the
> locally cached NEWS file and not the one on the server. Especially if
> you run it twice. Oh, I should have put cacheOK=FALSE in the
> download.file - but even that might get overruled somewhere. Also,
> sites may have good days and bad days, good minutes and bad minutes,
> your network may be congested on a short-term basis, etc etc.
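A minimal sketch of how the caveats above might be handled, not a tested
replacement for the function in the post: it passes cacheOK = FALSE to
download.file(), sets a download timeout, and turns any error or warning
into NA. The names time_download and cranometer2, the timer argument and
the 30-second timeout are assumptions made for illustration. That
/src/base/NEWS is available on every mirror is taken from the post.

```r
# Hypothetical helper: time one download, returning NA on any failure.
time_download <- function(url, timeout = 30) {
  dest <- tempfile()
  on.exit(unlink(dest), add = TRUE)
  old <- options(timeout = timeout)        # download.file() uses getOption("timeout")
  on.exit(options(old), add = TRUE)
  tryCatch({
    elapsed <- system.time(
      status <- download.file(url, dest, quiet = TRUE, cacheOK = FALSE)
    )[["elapsed"]]
    # download.file() returns 0 on success
    if (status == 0 && file.exists(dest)) elapsed else NA_real_
  },
  error   = function(e) NA_real_,
  warning = function(w) NA_real_)
}

# Hypothetical variant of cranometer(): same idea, failures become NA,
# and the result is sorted so the fastest mirror comes first (NAs last).
cranometer2 <- function(ms = getCRANmirrors(all = FALSE, local.only = FALSE),
                        path = "/src/base/NEWS", timer = time_download) {
  urls <- paste0(sub("/+$", "", ms$URL), path)   # avoid a doubled slash
  ms$t <- vapply(urls, timer, numeric(1), USE.NAMES = FALSE)
  ms[order(ms$t), ]
}
```

Calling cranometer2() with no arguments would time every mirror in the
current list; passing a subset of getCRANmirrors() restricts the test.
Local caching outside R's control can still distort the timings.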
> Other ideas: how about combining the CRAN list with my geonames
> package to work out distances from where you are to the CRAN site? I
> might write that later if I get a minute...
Yes! And visualize the corresponding "nearest neighbourhood"
for each CRAN mirror on a world map
and make this dynamically refreshing every few minutes
and put it on a webserver so people can watch the "CRAN world"
in real time!
More seriously, it would be really cool if a "robust" version of
cranometer() could be used automagically in the (typical /
default) case of install.packages() {and its call from the
Windows (or also Mac?) 'Packages' menu} when the user / site
have no CRAN repository specified:
It would choose the CRAN mirror which is closest,
or even better (and more appropriate for a statistics software),
would choose one at random, but with probability inversely
proportional to (a power of ?) the "distance".
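
The random choice could be sketched roughly as follows. This is only an
illustration under assumptions: choose_mirror, d and power are made-up
names, not an existing R API, and "distance" may be anything positive,
e.g. a geographic distance or the elapsed time in the 't' column that
cranometer() returns.

```r
# Hypothetical helper: pick one row of `mirrors` at random, with
# probability proportional to 1 / d^power. Mirrors whose distance is
# missing, infinite or non-positive are never chosen.
choose_mirror <- function(mirrors, d, power = 1) {
  stopifnot(length(d) == nrow(mirrors), power >= 0)
  ok <- which(is.finite(d) & d > 0)
  if (length(ok) == 0L) stop("no mirror with a usable distance")
  w <- 1 / d[ok]^power                     # sample.int() normalises the weights
  pick <- ok[sample.int(length(ok), size = 1L, prob = w)]
  mirrors[pick, , drop = FALSE]
}
```

With power = 0 every usable mirror is equally likely; larger powers
concentrate the choice on the nearest mirrors, so the closest one is
favoured without taking all of the load.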
... yes, we should move this from R-help to R-devel ..
Martin