[Rd] CRAN Server download statistics (Was: R Usage Statistics)
Gabor Grothendieck
ggrothendieck at gmail.com
Mon Nov 23 15:51:11 CET 2009
On Mon, Nov 23, 2009 at 9:48 AM, hadley wickham <h.wickham at gmail.com> wrote:
>> Knowing what percentage of different OSes are being used is of
>> interest to package developers and would be obscured by the proposal
>> to massage the data. I prefer to see the raw figure as is.
>
> I agree. I was arguing that sorting by that value wasn't very useful.
>
>> Also the number of IPs is important and should not be removed in my
>> opinion since (1) it is a measure of clustering. If a package is
>> mainly used by the courses of a few universities where the students
>> really have no choice then that seems a lot different than if it's used
>> by a variety of people around the world. Only the IPs would give any
>> clue to that. (2) it helps to diagnose intentional distortion of the
>> figures by repeat downloads to the same machine.
>
> There is no way to tease apart (1) and (2), plus many ADSL providers
> share an IP across multiple subscribers. Number of unique IPs may
> still be useful, but it needs to be used with caution.
>
>> The one problem with sparkline graphs is that it would take a lot
>> longer for the page to load. There already is a time series if you
>> click on the package name.
>
> Is it a time series? It looks like a bar chart of downloads per day
> of week to me.
>
A time series is a function of time regardless of representation.