[R] popular R packages

Emmanuel Charpentier charpent at bacbuc.dyndns.org
Sun Mar 8 23:45:23 CET 2009


On Sunday, 8 March 2009 at 13:22 -0500, Dirk Eddelbuettel wrote:
> On 8 March 2009 at 13:27, Duncan Murdoch wrote:
> | But we don't even have that data, since CRAN is distributed across lots 
> | of mirrors.
> 
> On 8 March 2009 at 19:01, Emmanuel Charpentier wrote:
> | As far as I can see (but I might be nearsighted), I see no model linking
> | package download to package use(s). Data may or may not become available
> 
> Which is why Debian (and Ubuntu) use the _opt-in package_ popularity-contest
> that collects data on packages used and submits that to a host collecting the
> data.  This drives the so-called 'popcon' statistics.
> 
> Yes, and there are many ways in which one can criticise this data collection
> process.   But I fail to see how __not having any data__ leads to more
> informed decisions.
> 
> Once you have data, you have an option of using or discarding it. But if you
> have no data, you have no option.  How is that better?

I question 1) whether the effort needed to get the data is worthwhile;
and 2) the very concept of data mining, which seems to be the rationale
for this proposed effort.

Furthermore (though this is seriously off-topic), I despise the
very idea of "popularity" in scientific debates... "Everybody does it"
is *not* a valid argument. Nor is "Everyone knows...".

					Emmanuel Charpentier



