[Rd] Wish list
M. Edward (Ed) Borasky
znmeb at cesmail.net
Sun Jan 1 20:08:42 CET 2006
Duncan Temple Lang wrote:
>And while we are on the topic of wishlists...
>Generally (i.e. not directed specifically to Gabor),
>the suggestions are very welcome, but so are contributions.
>And for issues such as making the existing R available on handhelds,
>that is a programming task.
>
Hasn't someone ported R to the Sharp Zaurus, for which both the Linux
kernel and a more or less complete GNU toolchain exist, plus at least
two GUI builders? I've forgotten what the compiler version is -- it
might be back around 2.95.
In any event, one of the Lisps and Maxima have been ported to the
Zaurus. I'm not sure how well a number crunching application like R
would run on the Zaurus processor, though -- IIRC the floating point is
emulated in software. Isn't the same true for Palms and Windows CE PDAs?
>And I draw a large distinction between
>programming and creative research which is based on new concepts and
>paradigms. The pool of people working in statistical computing research
>is very small. And to a large extent, their time is consumed with
>programming - making the same thing work on multiple platforms,
>correcting documentation, etc. which are good things, but
>not obviously the best use of available research ability and time.
>There are many more topics that are in progress that represent
>changes to what we can do rather than just to how we do the same thing.
>
>
I'd much rather have changes to what we can do rather than how we do the
same thing! As the Perl folks say, "There's more than one way to do it!"
So keep R and its contributed packages focused on providing the first few
ways to do something new!
>One of the reasons S (R and S-Plus) is where it is now
>is because in Bell Labs, the idea was to be thinking
>5 years ahead and both meeting and directing the needs for the future.
>Because of R's popularity (somewhat related to it being free), there is
>an aspect of development that focuses more on software for statisticians
>to use "right now".
>Obviously, the development is a mixture of both the current and the
>future, but there is less of the future and certainly less of the
>longer term directions that is sacrificed by the need to maintain an
>existing system and be backward-compatible.
>If statistics is to fulfill its potential in this modern IT, we need new
>ideas and research into those new ideas. If we focus on basic
>programming tasks (however complex) and demand usability above concepts,
>we risk losing those whose primary focus is in statistical computing
>research from the field.
>
>
Amen! Please don't turn R into Perl! The Perl community has statistical
libraries for the basics. If that's all you want to do, just learn how
to do it in Perl. The same goes for Python and Ruby. All the scripting
languages can be used for basic statistical and numeric processing, and
their communities are adding libraries for more advanced functionality
all the time.
But no other language/community has the breadth of advanced statistical
processing that R and its contributed packages have, and no other
language has the right core semantics to make this kind of computing
easy, with the possible exception of the newest dialects of Fortran. I
*could* write a web ecommerce site in R if I wanted to, but why would I?
I'd do that in PHP or the new Ruby on Rails, because that's what those
languages were designed to do well!
>While R provides statisticians and stat. comp. researchers with a
>terrific vehicle for doing their respective work, it also acts as
>a constraint for doing anything even moderately new. But much (not all)
>of R is based on innovations from the 1970's, 80's and 90's. And
>as IT evolves at a terrific pace, to keep up with it, we need to be
>forward looking.
>
>
Could you elaborate on the nature of the constraints R imposes?
Obviously there are *time* constraints made necessary by the programming
tasks and finite number of community members, but are there limits to
the kinds of scientific/statistical computing thoughts one can think if
one only uses R and its contributed packages?
>I'll leave it there - for the moment - and go fight off the ants
>that are invading my desk! While I wrote this down relatively
>rapidly, the ideas have been brewing for a long time. If anyone
>wishes to comment on the theme, I hope they will take a few minutes
>to think about the broad set of issues and tradeoffs.
>
>
I've been thinking about related issues over the holiday break, mostly
triggered by Paul Graham's essay on a programming language that would
last 100 years. The essay will appear on my blog in the near future.
Meanwhile, I'll add my wish list (and list of things I'd work on in my
spare time if I had any :) ) for R.
1. An integrated symbolic math capability. I think packaging GiNaC
(http://freshmeat.net/projects/ginac/) is the logical way to do this.
GiNaC is a C++ library, and I suspect it could be easily packaged, but I
haven't tried it yet. If someone is ahead of me on this, I'd like to
know about it before I attempt it.
2. A good solid discrete time and continuous time Markov chain analyzer
for use in computer performance analysis. There are quite a few good
toolsets out there, some with GUIs and some without, but nearly all of
them have licenses that are not free as in speech. They're freely
obtainable in the academic community, but not for "commercial use".
There is one exception, and if I followed the path of integrating an
existing package, I'd go with Prism (http://www.cs.bham.ac.uk/~dxp/prism/).
3. Along the lines of 2, more "out-of-core" solver capabilities. I don't
think it's going to be much longer before a "typical scientific
researcher" in a domain like bioinformatics or computer performance
analysis will have available a workstation with two physical 64-bit
processors, 4GB of RAM, and a terabyte of local disk, plus, of course, access to a
grid for the "big problems." :) At the moment, I don't have any computer
performance analysis problems with enough states to require an efficient
out-of-core solver, but it's bound to happen.
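To make item 2 a bit more concrete, here is a rough sketch of the core
computation a discrete time Markov chain analyzer performs: finding the
steady-state distribution of a transition matrix. It is written in plain
Python only so that it stands alone. The function name
stationary_distribution and its parameters are made up for illustration and
are not the API of Prism or of any R package. It uses simple power iteration,
which assumes the chain is irreducible and aperiodic; a real analyzer would
use sparse or out-of-core solvers instead.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Approximate the steady-state vector pi satisfying pi = pi * P.

    P is a row-stochastic transition matrix given as a list of lists.
    Hypothetical helper for illustration; assumes the chain is
    irreducible and aperiodic so that power iteration converges.
    """
    n = len(P)
    for row in P:
        if len(row) != n or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("P must be square with rows summing to 1")

    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: next[j] = sum_i pi[i] * P[i][j]
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


if __name__ == "__main__":
    # Toy two-state chain, e.g. a server that is "idle" or "busy".
    P = [[0.9, 0.1],
         [0.5, 0.5]]
    print(stationary_distribution(P))
```

For this toy chain the balance equation 0.1 * pi[0] = 0.5 * pi[1] gives the
exact answer (5/6, 1/6), which the iteration approaches. The dense matrix is
the part that stops scaling: once the state space no longer fits in memory,
this is where the out-of-core solvers of item 3 come in.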
--
M. Edward (Ed) Borasky
http://borasky-research.blogspot.com