>I wonder if there are any capture-recapture type methodologies for
>estimating open-source software usage?  Another idea would be to
>combine with some other known numbers, e.g. book sales, conference
>attendance etc. You'd need personal information to link the data sets

This totally cracked me up! I'm envisioning going into one of our
computer labs, tossing a net over an unsuspecting student, and then
tagging their ear with a code that represents which stat package they're
using. Then release and later recapture. What percent did we get? That's
what the profs I deal with do with animals to estimate populations.
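
For anyone who hasn't seen it, the arithmetic behind that tag-and-recapture
trick can be sketched as below. This is only an illustration using the
standard Lincoln-Petersen estimator and Chapman's small-sample variant; the
function names and the student counts are made up for the example, not data
from any real lab.

```python
def lincoln_petersen(marked, caught, recaptured):
    """Estimate population size as marked * caught / recaptured.

    marked     -- individuals tagged in the first sample
    caught     -- individuals caught in the second sample
    recaptured -- individuals in the second sample that already carry a tag
    """
    if recaptured <= 0:
        raise ValueError("need at least one recaptured individual")
    return marked * caught / recaptured


def chapman(marked, caught, recaptured):
    """Chapman's variant, which is less biased when samples are small."""
    return (marked + 1) * (caught + 1) / (recaptured + 1) - 1


# Hypothetical numbers: tag 50 students, later catch 40, 10 of them tagged.
print(lincoln_petersen(50, 40, 10))   # 200.0
print(round(chapman(50, 40, 10), 1))  # 189.1
```

The "what percent did we get?" question is recaptured / caught: the tagged
share of the second sample (here 10 of 40) stands in for the tagged share of
the whole population.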

Conference attendance might be easy to get if I remember to contact the
people running the conferences. Does anyone know how many we expect at
useR! 2010? I recall SAS conferences with 3,500 attendees, but data
analysis is a tiny part of that conference. I also heard someone say that
the organizers moved it to Hawaii one year to REDUCE the attendance, as it
had grown so large. Sounds crazy
to me, but if there are attempts to manage the figures, that could muck
up the interpretation. Well, all these approaches have their own
problems, so that's just another "limitation of the study." I think SPSS
Directions has more like 500 but it's all focused on some sort of

I did try to count books on Amazon and published papers via Google
Scholar. Those searches are devilishly difficult for SAS, let alone for
the letter R!

An easy one to get should be the number of mailing list subscribers. I'll
try to get those figures. Does anyone know it for R-help?


>PS.  It would be also interesting to see the contributions of the
>R-SIG mailing lists and other specialised R related mailing lists.  My
>feeling is that there is not a lot of overlap between the members of
>the ggplot2 mailing list and R-help.
