[R] Popularity of R, SAS, SPSS, Stata...

(Ted Harding) Ted.Harding at manchester.ac.uk
Sun Jun 20 21:41:46 CEST 2010


On 20-Jun-10 19:07:21, Muenchen, Robert A (Bob) wrote:
>>I wonder if there are any capture-recapture type methodologies for
>>estimating open-source software usage?  Another idea would be to
>>combine with some other known numbers, e.g. book sales, conference
>>attendance etc. You'd need personal information to link the data sets
>>together.
>>
>>Hadley
> 
> This totally cracked me up! I'm envisioning going into one of our
> computer labs, tossing a net over an unsuspecting student, and then
> tagging their ear with a code that represents which stat package
> they're using. Then release and later recapture. What percent did
> we get? That's what the profs I deal with do with animals to estimate
> populations.

I've given thought in the past to the question of estimating the R
user base, and came to the conclusion that it is impossible to get
an estimate of the number of users that one could trust (or even
attach anything like a margin of error to).

I think one could get a number which represented a moderately
informative lower bound: just count the number of distinct email
addresses that have ever posted to the R-help list. This will of
course overcount in two ways, since it includes people who post (or
have posted) from more than one email address, and people who tried
R for a while and then dropped it. My feeling is that these are
likely to be outweighed by the number of people who have used R but
have never posted (for example students who are getting their R help
from their instructors, people using R in a corporate context who are
discouraged from posting to public lists, etc.).
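
As a rough illustration of that count, here is a minimal Python sketch.
It assumes you already have the "From:" header lines of the archive as
plain text; the function name and the sample lines are made up for the
example, and the pattern is only meant for header lines, not message
bodies. It treats the archive's "user at host" form the same as
"user@host" and ignores case, but it cannot tell that two different
addresses belong to one person.

```python
import re

# Matches "user@host" and the obfuscated "user at host" form used in list archives.
# Intended for From: header lines only; ordinary prose containing " at " could
# produce false matches.
ADDRESS = re.compile(r"([A-Za-z0-9._%+-]+)(?:@| at )([A-Za-z0-9.-]+\.[A-Za-z]{2,})")

def distinct_posters(header_lines):
    """Return the set of distinct, lower-cased sender addresses found."""
    seen = set()
    for line in header_lines:
        match = ADDRESS.search(line)
        if match:
            user, host = match.groups()
            seen.add(f"{user}@{host}".lower())
    return seen

# Hypothetical sample input: two spellings of one address, one other poster.
sample = [
    "From: Ted.Harding at manchester.ac.uk",
    "From: ted.harding@manchester.ac.uk",
    "From: someone at example.org",
    "Subject: no address here",
]
print(len(distinct_posters(sample)))  # prints 2
```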

The number of subscribers to R-help (currently about 10200) is
a definite lower bound for the number of R users, but a loose one,
since many users post to R-help without being subscribed.

I would expect that the total number of different email addresses
that have posted to R-help would be considerably larger than 10200.

I don't think a "Mark-Recapture" approach is feasible.
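
To make concrete what such an approach would need: the simplest
two-sample version (the Lincoln-Petersen estimator) estimates the
population size as n1*n2/m, where n1 and n2 are the sizes of the two
samples and m is the number of individuals seen in both. The Python
sketch below uses made-up identifiers purely for illustration. Finding
m means matching the same person across two lists, which is exactly
the step that requires personal information.

```python
def lincoln_petersen(first_sample, second_sample):
    """Two-sample mark-recapture estimate of population size: n1 * n2 / m."""
    marked = set(first_sample)        # individuals "tagged" in the first sample
    second = set(second_sample)       # individuals seen in the second sample
    recaptured = len(marked & second) # m: individuals present in both samples
    if recaptured == 0:
        raise ValueError("no recaptures: the estimate is undefined")
    return len(marked) * len(second) / recaptured

# Hypothetical data: 4 people in one list, 6 in another, 2 in both.
list_one = ["a", "b", "c", "d"]
list_two = ["c", "d", "e", "f", "g", "h"]
print(lincoln_petersen(list_one, list_two))  # prints 12.0
```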

Further, I don't know how one might take account of the fact that
some installations of R (e.g. on a corporate or institutional
or departmental server) may each be used by several users.

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 20-Jun-10                                       Time: 20:41:43
------------------------------ XFMail ------------------------------


