[R] Statistical computing

Douglas Bates bates at stat.wisc.edu
Mon Mar 31 18:12:57 CEST 2003


"Bashir Saghir (Aztek Global)" <Saghir.Bashir at ucb-group.com> writes:

> <snip>
> >Saghir, why do you prefer Python?
> <snip>
> 
> I was thinking about learning Perl many years ago and I asked my system
> admin for advice. His enthusiasm for Python steered me away from Perl and
> I've been hooked since. Basically it is easy to learn and program
> development is quick. 

Python 'feels' very much like R to me (and a bit like Java too).  Perl
is great for the two-minute hack to accomplish an awkward
transformation, but the saying in the Python community is that "Hell is
reading someone else's Perl code".  It could also be said that
"Purgatory is reading your own Perl code from more than a few weeks
ago".

I like the IDE for Python.  For me working in Python is similar to
working in R in that I have a window open where I am writing the code
and another interactive window where I can send snippets of the code
for execution, say to test out exactly what the result of some
expression is.  Once you get beyond the initial shock of discovering
that the indentation of code in Python determines the lexical grouping,
you find that Python is a clean, well-structured language.  I don't
think the same can be said for Perl.
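
For anyone who has not seen it, here is a minimal sketch of what
"indentation determines the grouping" means in practice.  The function
name and the sample values are made up purely for illustration.

```python
def count_positive(values):
    """Count the values greater than zero."""
    count = 0
    for v in values:
        if v > 0:
            count += 1   # indented under the if: runs only when v > 0
    return count         # aligned with the for: runs once, after the loop


result = count_positive([3, -1, 4, 0, 5])
print(result)
```

Moving the return statement one level to the right would put it inside
the loop and change the meaning, which is why there are no braces or
begin/end keywords to keep in step with the layout.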

Tanya mentioned data management and data cleaning.  I think the
combination of a relational database management system, such as MySQL
or PostgreSQL, and Python and R is very powerful for data cleaning.
Python can be used for sequential processing and for loading the
database.  SQL can be used for examining the structure of the data and
for detecting unusual cases.  R can be used for model fitting and
graphics on the entire data set, if it is not huge, or on a subset, if
it is huge.
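
A rough sketch of the Python and SQL parts of that workflow might look
like the following.  It is only an illustration under several
assumptions: it uses the sqlite3 module from the Python standard
library as a stand-in for MySQL or PostgreSQL so that it runs without a
server, and the table name, column names, sample records and the
plausible age range are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw records; in practice these would come from a file.
RAW = """id,age,weight_kg
1,34,70.5
2,29,
3,250,81.0
4,41,64.2
"""


def load_rows(text):
    """Process the records sequentially, turning blank fields into None."""
    for rec in csv.DictReader(io.StringIO(text)):
        yield (
            int(rec["id"]),
            int(rec["age"]) if rec["age"] else None,
            float(rec["weight_kg"]) if rec["weight_kg"] else None,
        )


def build_database(text):
    """Load the cleaned records into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE subjects "
        "(id INTEGER PRIMARY KEY, age INTEGER, weight_kg REAL)"
    )
    conn.executemany("INSERT INTO subjects VALUES (?, ?, ?)", load_rows(text))
    conn.commit()
    return conn


def unusual_cases(conn):
    """Use SQL to find records with missing or implausible values."""
    query = """
        SELECT id FROM subjects
        WHERE age IS NULL
           OR age NOT BETWEEN 0 AND 120
           OR weight_kg IS NULL
        ORDER BY id
    """
    return [row[0] for row in conn.execute(query)]


conn = build_database(RAW)
suspect_ids = unusual_cases(conn)
print(suspect_ids)
```

The ids flagged by the query can then be checked by hand, and the
cleaned table (or a subset of it) pulled into R for model fitting and
graphics.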

Together I find this combination more powerful than SAS or SPSS and
definitely faster.  However, using this combination requires learning
three different languages.


