[R] popular R packages

Spencer Graves spencer.graves at prodsyse.com
Sun Mar 8 17:47:08 CET 2009


      Is this another discussion of what data might be collected and 
analyzed, and what could and could not be said if we only had such data? 

      Has anyone but me produced any actual data?  If so, I missed it.  
Hadley mentioned the 'fortunes' package.  My earlier methodology, 
"RSiteSearch('library(fortunes)')", produced 40 hits for 'fortunes', 
compared to 169 for 'lme4' and 2 for 'DierckxSpline'. 
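
      As a rough sketch of how those three counts might be compared, the 
snippet below ranks them and scales each by the largest.  The counts are 
the ones quoted above; the variable names (hits, ranked, rel) are 
illustrative only, and the commented-out loop assumes the analogous 
search string was used for each package. 

```r
# Hit counts as reported in this message (RSiteSearch results, March 2009).
hits <- c(fortunes = 40, lme4 = 169, DierckxSpline = 2)

# Rank packages by number of hits.
ranked <- sort(hits, decreasing = TRUE)

# Express each count relative to the largest count.
rel <- hits / max(hits)

print(ranked)
print(round(rel, 3))

# To repeat the searches interactively (each call opens a browser page,
# so the counts still have to be read off by hand):
# for (pkg in names(hits)) RSiteSearch(sprintf("library(%s)", pkg))
```

      Counts like these measure mentions in the searched archives, not 
installations or users, so they are at best one perspective among several. 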

      With anything like this, it would be wise to approach the problem 
from many different perspectives, recognizing that the strengths of one 
approach can help improve our understanding of what other analyses say 
about the question at hand. 

      Happy Sunday. 
      Spencer Graves    

(Ted Harding) wrote:
> On 08-Mar-09 15:14:03, Duncan Murdoch wrote:
>   
>> On 08/03/2009 10:49 AM, hadley wickham wrote:
>>     
>>>> More seriously: I don't think relative numbers of package downloads
>>>> can be interpreted in any reasonable way, because reasons for
>>>> package download have a very wide range from curiosity ("what's
>>>> this?"), fun (think "fortunes"...), to vital need (think lme4
>>>> if/when a consensus on denominator DFs can be reached :-)...).
>>>> What can you infer in good faith from such a mess?
>>>>         
>>> So when we have messy data with measurement error, we should just
>>> give up?  Doesn't sound very statistical! ;)
>>>       
>> I think the situation is worse than messy.  If a client comes in with 
>> data that doesn't address the question they're interested in, I think 
>> they are better served by being told that than by being given an answer
>> that is not actually valid.  They should also be told how to design a study 
>> that actually does address their question.
>>
>> You (and others) have mentioned Google Analytics as a possible way to 
>> address the quality of data; that's helpful.  But analyzing bad data 
>> will just give bad conclusions.
>> Duncan Murdoch
>>     
>
> The population of R users (which we would need to sample in order
> to obtain good data) is probably more elusive than a fish population
> in the ocean -- only partially visible at best, and with an unknown
> proportion invisible.
>
> At least in fisheries research, there are long-established capture
> techniques (from trawling to netting to electro-fishing to ...)
> which can be deployed, for research purposes, in such a way as to
> potentially reach all members of a target population, with at least
> a moderately good approximation to random sampling. What have we
> for R?
>
> Come to think of it, electro-fishing, ...
>
> Suppose R were released with 2 types of cookie embedded in base R.
> Each type is randomly configured, when R is first run, to be Active
> or Inactive (probability of activation to be decided at the design
> stage ... ). Type 1, if active, on a certain date generates an
> event which brings it to the notice of R-Core (e.g. by clandestine
> email or by inducing a bug report). Type 2 acts similarly on a later
> date. If Type 2 acts, it carries with it information as to whether
> there was a Type 1 action along with whether, apparently, the Type 1
> action "succeeded".
>
> We then have, in effect, an analogue of the Mark-Recapture technique
> of population estimation (along with the usual questions about
> equal catchability and so forth).
>
> However, since this sort of thing (which I am not proposing seriously,
> only for the sake of argument) is undoubtedly unethical (and would
> do R's reputation no good if it came to light), I tentatively conclude
> that the population of R users is likely to remain as elusive as ever.
>
> Best wishes to all,
> Ted.
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 08-Mar-09                                       Time: 16:11:44
> ------------------------------ XFMail ------------------------------
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
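
      For readers unfamiliar with the Mark-Recapture idea in Ted's message 
above, here is a minimal sketch of the standard Lincoln-Petersen estimator 
and Chapman's bias-corrected variant.  The mapping onto the two cookie 
types is an assumption made for illustration, the function names are 
hypothetical, and the counts are invented; none of this comes from real 
data. 

```r
# Lincoln-Petersen estimate of population size from two capture occasions:
#   n1 = individuals marked on the first occasion
#   n2 = individuals caught on the second occasion
#   m  = individuals caught on the second occasion that were already marked
lincoln_petersen <- function(n1, n2, m) {
  stopifnot(n1 > 0, n2 > 0, m > 0, m <= min(n1, n2))
  n1 * n2 / m
}

# Chapman's variant, less biased in small samples and defined when m is 0.
chapman <- function(n1, n2, m) {
  stopifnot(n1 >= 0, n2 >= 0, m >= 0, m <= min(n1, n2))
  (n1 + 1) * (n2 + 1) / (m + 1) - 1
}

# Hypothetical counts, invented purely for illustration.
n1 <- 500  # installations reporting on the first date (Type 1)
n2 <- 400  # installations reporting on the later date (Type 2)
m  <- 20   # Type 2 reports that also record a Type 1 action

lp_hat <- lincoln_petersen(n1, n2, m)
ch_hat <- chapman(n1, n2, m)

print(lp_hat)
print(ch_hat)
```

      Both estimators rely on the assumptions Ted alludes to (a closed 
population and equal catchability), which a population of software users 
would be unlikely to satisfy. 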



