[R] Justifying R to anti open-source management

Michael Grant mwgrant2001 at yahoo.com
Wed May 17 19:52:50 CEST 2006


Hello Peter,

I am working on a related problem--getting R accepted by division and
project QA. Unfortunately, it keeps getting put on the back burner as I
address time-sensitive needs. I did some googling and made a few phone calls.
I expect that there is much more to be found, but below is a US-agency-oriented
compilation of what I got in my brief search. I also ran into a
number of USDOE (National Labs HPC stuff) reports, but I seem to have lost track
of that info.

QA in non-academic circles can sometimes be an anti-quality driver, can't it?
Oh, let's give this thread some irrelevant legs...EXCEL!!!! You all know what I
am talking about ;O)

Regards,
Michael Grant


My little but serious list (HTH):


1.) US Environmental Protection Agency -- Dr. R. Woodrow Setzer of the USEPA, a
contributor to this list, pointed out this comment in an EPA FIFRA Scientific
Advisory Review Panel report:

“The Panel also commends the EPA on the use of R (see the main EPA report for
references), as it is the best way to ensure portable, open code that is freely
available to all interested users, with state-of-the art algorithms for
statistical calculation.” -- FIFRA Scientific Advisory Panel,
http://www.epa.gov/scipoly/sap/2001/september/finalreport.htm

A Set of Scientific Issues Being Considered by the Environmental Protection
Agency Regarding: Preliminary Cumulative Hazard and Dose-Response Assessment
for Organophosphorus Pesticides: Determination of Relative Potency and Points
of Departure for Cholinesterase

R was also used in the N-methyl Carbamates cumulative risk assessment; link at
http://www.epa.gov/oscpmont/sap/2005/index.htm#august

2.) US National Institute of Standards and Technology (NIST), Statistical
Engineering Division
http://www.itl.nist.gov/div898/pubs/ar/SED2004.pdf
Collaborative research between members of the Statistical Engineering Division
(SED) and members of the Process Measurements Division (Chemical Sciences and
Technology Laboratory) has required that SED staff investigate various
statistical tools for data mining. These tools include some very powerful
statistical classification/prediction methods for high-dimensional data. This
article briefly summarize this ongoing effort with the goal of bringing
attention to a wide array of methods in a statistical toolkit that is already
easily available to NIST scientists who may need them. Most of these functions
have a user-friendly interface in the open source environment R and widely
available commercial product S-plus.

3.) US Department of Energy (USDOE), Oak Ridge National Laboratory,
http://www.csm.ornl.gov/esh/aoed/ORNLTM2005ab52.htm
STATISTICAL METHODS AND SOFTWARE FOR THE ANALYSIS OF OCCUPATIONAL EXPOSURE DATA
WITH NON-DETECTABLE VALUES

Edward L. Frome
Computer Science and Mathematics Division
Oak Ridge National Laboratory

Paul F. Wambach
U. S. Department of Energy
Date Published: September 2005

All of these methods are well known but computational complexity has limited
their use in routine data analysis with left censored data. The recent
development of the R environment for statistical data analysis and graphics has
greatly enhanced the availability of high-quality nonproprietary (open source)
software that serves as the basis for implementing the methods in this paper.
Numerical examples are provided and R (2004) functions are available at the
analysis of occupational exposure data web site
http://www.csm.ornl.gov/esh/aoed/ (AOED).


4.) Historical Evaluation of the Film Badge Dosimetry Program at the Y-12
Facility in Oak Ridge, Tennessee, Part 1 – Gamma Radiation
J.P. Watkins (1), G.D. Kerr (2), E.L. Frome (3), W.G. Tankersley (1), and
C.M. West (+)
ORAU Technical Report # 2004-0888
(1) Center for Epidemiologic Research, Oak Ridge Associated Universities
(2) Kerr Consulting Company
(3) Computer Science and Mathematics Division, Oak Ridge National Laboratory
(+) Deceased
This work was done under Contract No. 200-2002-00593 with the National
Institute for Occupational Safety and Health.

5.) US FEMA http://www.fema.gov/txt/fhm/frm_cfd43.txt Flood 4.3 Flood frequency
analysis methods

At the end of this section:

"Several open-source and commercial software packages provide tools to assist
in the sorts of analyses discussed in this section. In particular, the S,
S-PLUS, and R programming languages (commercial and open-source versions of a
high-level statistical programming language) include comprehensive statistical
tools. The R language package is available for free from the web site
http://www.r-project.org/; several books discussing the use of R and S are
available. Other well-known software packages include Mathematica, Matlab,
SPSS, and SYSTAT."

6.) National Cancer Institute Advanced Biomedical Computing Center lists R as
“available to staff” at
http://www1.ncifcrf.gov/app/htdocs/appdb/appinfo.php?appname=R-Project


7.) Weston, USACE and USEPA:
MODEL VALIDATION: MODELING STUDY OF PCB CONTAMINATION IN THE HOUSATONIC RIVER



--- "Peter Baker (CMIS, St Lucia)" <Peter.Baker at csiro.au> wrote:

> Hi
> 
> I apologise for this question as it really must be a FAQ. Unfortunately, 
> I can't find the answer and I'm tired of looking at endless google results
> 
> A colleague of mine works for a state government department that has a
> policy against open source software or software tainted by open
> source. Other government departments in the same state use R but this
> particular department is driven by very non-numerate people and
> superficially at least it appears somewhat backward IT-wise.  The
> department may purchase SPlus (which may be better for non programmer
> types anyway) or SPSS but it would be nice to have the option to use R
> 
> The Q:
> 
> Are there any documents/reports/papers out there justifying R that
> comment on
> - quality of R
> - huge range of libraries available 
> - support (via a huge and enthusiastic user base - any ideas on how
>     many people use R)
> 
> I suspect that providing existing documents would carry more weight
> rather than writing a case from scratch or providing people's email
> opinions
> 
> Thanks in advance!
> 
> Cheers
> Peter
> 
> -- 
> Dr Peter Baker, Statistician (Bioinformatics/Genetics),
> CSIRO Mathematical & Information Sciences, Queensland Bioscience Precinct
> 306 Carmody Road, St Lucia Qld 4067.   Australia.
> Email: <Peter.Baker at csiro.au>  WWW: http://www.cmis.csiro.au/Peter.Baker/
> Phone: +61 7 3214 2210         Fax: +61 7 3214 2900
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



