[R] popular R packages
Emmanuel Charpentier
charpent at bacbuc.dyndns.org
Sun Mar 8 23:45:23 CET 2009
On Sunday, 8 March 2009 at 13:22 -0500, Dirk Eddelbuettel wrote:
> On 8 March 2009 at 13:27, Duncan Murdoch wrote:
> | But we don't even have that data, since CRAN is distributed across lots
> | of mirrors.
>
> On 8 March 2009 at 19:01, Emmanuel Charpentier wrote:
> | As far as I can see (but I might be nearsighted), I see no model linking
> | package download to package use(s). Data may or may not become available
>
> Which is why Debian (and Ubuntu) use the _opt-in package_ popularity-contest
> that collects data on packages used and submits that to a host collecting the
> data. This drives the so-called 'popcon' statistics.
>
> Yes, and there are many ways in which one can criticise this data collection
> process. But I fail to see how __not having any data__ leads to more
> informed decisions.
>
> Once you have data, you have an option of using or discarding it. But if you
> have no data, you have no option. How is that better?

I question 1) whether the effort needed to get the data is worthwhile;
and 2) the very concept of data mining, which seems to be the rationale
for this proposed effort.

Furthermore (but this is seriously off-topic), I despise the very idea
of "popularity" in scientific debates... "Everybody does it" is *not* a
valid argument. Nor is "Everyone knows...".

Emmanuel Charpentier