[R] popular R packages

Mon Mar 9 00:02:48 CET 2009

2009/3/8 Emmanuel Charpentier <charpent at bacbuc.dyndns.org>:

> I question 1) the usefulness of the effort necessary to get the data ;
> and 2) the very concept of data mining, which seems to be the rationale
> for this proposed effort.
>
> Furthermore (but this is seriously off-topic), I seriously despise the
> very idea of "popularity" in scientific debates... "Everybody does it"
> is *not* a valid argument. Nor "Everyone knows...".

 As long as we agree that package downloads != popularity then we have
useful data.

 Usefulness of the data? Let's think...

 Suppose we discover that spatstat is downloaded 100 times more than
splancs is. Both packages compute K-functions of spatial data. Pretend
there's an enhancement to K-function computation that could be
implemented in spatstat and/or splancs. Why bother doing it in
splancs?

 Currently the only usage stats we have are even worse measures such
as number of mentions in R-help or number of bug reports. Or maybe
citation counts, but who would make important decisions based on
those?

 I'd love to go 'Hmmm how many people are using my package?' and get
an exact answer. Given the impossibility of that information, I'd love
to go 'Hmmm how many people downloaded my package?', a good
approximation to which is not beyond the bounds of our technology. Web
pages have had annoying 'this piece of software has been downloaded
443535 times' banners (often enclosed in <blink> tags) since 1996. Yes,
it would require some effort at each CRAN site, but the CRAN mirror
site maintainers might be interested in doing this. If they don't
want to, then fine.
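
 To give a rough idea of the effort involved at a mirror, here is a
minimal sketch in Python that counts successful source-package
downloads per package from web server log lines. It assumes the mirror
writes Apache-style access logs and serves source packages under
/src/contrib/ as <package>_<version>.tar.gz; the function name
count_downloads, the sample log lines and the version numbers in them
are made up for illustration, and a real mirror's log format may well
differ.

```python
import re
from collections import Counter

# Assumption: Apache "common"/"combined" style log lines, with source
# packages requested as /src/contrib/<package>_<version>.tar.gz.
REQUEST = re.compile(
    r'"GET \S*/src/contrib/(?P<pkg>[A-Za-z][A-Za-z0-9.]*)_[^/\s]+\.tar\.gz'
    r' HTTP/[\d.]+" (?P<status>\d{3})'
)


def count_downloads(lines):
    """Count requests answered with status 200, keyed by package name."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group("status") == "200":
            counts[match.group("pkg")] += 1
    return counts


# Hypothetical log lines (addresses, versions and sizes are invented).
sample_log = [
    '192.0.2.1 - - [08/Mar/2009:23:00:01 +0100] '
    '"GET /src/contrib/spatstat_1.15-0.tar.gz HTTP/1.1" 200 1234567',
    '192.0.2.2 - - [08/Mar/2009:23:05:12 +0100] '
    '"GET /src/contrib/spatstat_1.15-0.tar.gz HTTP/1.1" 200 1234567',
    '192.0.2.3 - - [08/Mar/2009:23:07:40 +0100] '
    '"GET /src/contrib/splancs_2.01-24.tar.gz HTTP/1.1" 200 234567',
    '192.0.2.4 - - [08/Mar/2009:23:09:03 +0100] '
    '"GET /src/contrib/splancs_9.99.tar.gz HTTP/1.1" 404 512',
    '192.0.2.5 - - [08/Mar/2009:23:10:00 +0100] '
    '"GET /src/contrib/PACKAGES.gz HTTP/1.1" 200 98765',
]

if __name__ == "__main__":
    for pkg, n in count_downloads(sample_log).most_common():
        print(pkg, n)
```

 Raw counts like these still include repeat downloads, robots and
mirror-to-mirror syncing, so they measure downloads, not users, which
is the distinction made at the top of this message.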

Barry