[R] open source and R

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Mon Nov 14 00:14:01 CET 2005


On 13-Nov-05 Roger Bivand wrote:
> On Sun, 13 Nov 2005, Robert wrote:
> 
>> If I do not know C or FORTRAN, how can I fully understand the package
>> or possibly improve it?
> 
> By learning enough to see whether that makes a difference for your 
> purposes. Life is hard, but that's what makes life interesting ...
> 
>> Robert.
>> 
>> Roger Bivand <Roger.Bivand at nhh.no> wrote:
>> On Sun, 13 Nov 2005, Robert wrote:
>> 
>> > Roger Bivand wrote: 
>> > On Sun, 13 Nov 2005, Robert wrote:
>> > 
>> > > It uses FORTRAN code and not in pure R.
>> > 
>> > The same applies to deldir - it also includes Fortran. So the
>> > answer seems to be no, there is no voronoi function only
>> > written in R.
>> > 
>> 
>> Robert wrote:
>> 
>> > 
>> > I am curious about one thing: since the reason for using r
>> > is r is a easy-to-learn language and it is good for getting
>> > more people involved.
>> >
>> > Why most of the packages written in r use other languages
>> > such as FORTRAN's code? I understand some functions have
>> > already been written in other language or it is faster to
>> > be implemented in other language.
>> >
>> > But my understanding is if the user does not know that
>> > language (for example, FORTRAN), the package is still a
>> > black box to him because he can not improve the package and
>> > can not be involved in the development. 
>> >
>> > When I searched the packages of R, I saw many packages with
>> > duplicated or similar functions. the main difference among
>> > them are the different functions implemented using other
>> > languages, which are always a black box to the users. So it
>> > is very hard for users to believe the package will run
>> > something they need, let alone getting involved in the
>> > development. My comments are not to disregard these efforts.
>> > But it is good to see the packages written in pure R.
>> > 
>> 
>> Although surprisingly much of R is written in R, quite a lot is
>> written in Fortran and C. One very good reason, apart from
>> efficiency, is code re-use - BLAS and LAPACK among many others
>> are excellent implementations of what we need for numerical
>> linear algebra. R is very typical
>> of good scientific software, it tries to avoid re-implementing
>> functions that are used by the community, are well-supported by
>> the community, and work. Packages by and large do the same - if
>> existing software does the required job, package authors attempt
>> to port that software to R, providing interfaces to underlying
>> C or Fortran libraries. 
>> 
>> It's about standing on the shoulders of giants.

Those are very strong points. Some comments:

It would be possible to implement in "pure R" a matrix inversion
or eigenvalue/vector function, for instance, and I'm sure it would
be done (if it were done) to very high quality. However, it would
run like an elephant in quicksands. BLAS and LAPACK have, over the
years, become highly optimised not just for accuracy and robustness,
but for speed and efficiency.
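
To make that concrete, here is a rough sketch of what a "pure R"
matrix inversion might look like: plain Gauss-Jordan elimination with
partial pivoting. The function name invert_pure_r is my own invention
for illustration and is not part of R or of any package; it is a toy
to compare against solve(), which hands the work to LAPACK, and not a
recommendation.

```r
# Gauss-Jordan inversion with partial pivoting, written only in R.
# 'invert_pure_r' is a hypothetical name used for illustration.
invert_pure_r <- function(a) {
  stopifnot(is.matrix(a), nrow(a) == ncol(a))
  n <- nrow(a)
  aug <- cbind(a, diag(n))                 # augmented matrix [A | I]
  for (k in seq_len(n)) {
    # choose the row (from k downwards) with the largest entry in column k
    p <- which.max(abs(aug[k:n, k])) + k - 1
    if (aug[p, k] == 0) stop("matrix is singular")
    if (p != k) aug[c(k, p), ] <- aug[c(p, k), ]   # swap rows k and p
    aug[k, ] <- aug[k, ] / aug[k, k]               # scale the pivot row
    for (i in seq_len(n)[-k]) {                    # clear column k elsewhere
      aug[i, ] <- aug[i, ] - aug[i, k] * aug[k, ]
    }
  }
  aug[, (n + 1):(2 * n), drop = FALSE]     # right-hand half is the inverse
}

# Compare with the LAPACK-backed solve() on a well-conditioned matrix
set.seed(1)
a <- matrix(rnorm(25), 5, 5) + 5 * diag(5)
print(all.equal(invert_pure_r(a), solve(a)))
```

Wrapping each call in system.time() on a matrix of a few hundred rows
should show the difference in speed; I would expect the interpreted
loops to lose by a wide margin, though the exact ratio will depend on
the machine and the BLAS in use.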

Also, you will hit the "other language" problem sooner or
later. Robert's complaint is that he does not like black
boxes. But R itself is a black box. You cannot write R in R,
all the way down to the bottom. At the bottom is machine
code, and languages like assembler, C, C++, FORTRAN and
their compilers provide "black box" wrappers for this.
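
You can see where the R level stops from within R itself. The sketch
below contrasts an ordinary function written in R (r_level is a
made-up name for illustration) with two primitives, which have no R
body to inspect because they are entry points into compiled code.

```r
# A closure written in R: its body can be printed and read.
# 'r_level' is a hypothetical function used only for illustration.
r_level <- function(x) x + 1
print(body(r_level))

# Primitives have no R-level body; they are implemented in compiled C.
print(is.primitive(sum))
print(is.primitive(`+`))
print(body(sum))     # NULL: nothing more to read at the R level
```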

That is not a whimsical comment either -- all those discussions
about why  2 - sqrt(2)^2 is not equal to 0 come down to this
sort of issue. Sooner or later, if you really want to understand
what is going on, you have to get beneath the shiny smooth
surface and swim amongst the molecules!
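
For anyone who has not met that discussion, a small sketch of it
follows. It assumes the usual IEEE-754 double precision arithmetic,
where sqrt(2) cannot be stored exactly, so the exact comparison fails
while a comparison within a tolerance succeeds.

```r
# Residual that would be zero in exact arithmetic
residual <- 2 - sqrt(2)^2

exact_zero <- residual == 0                     # exact comparison
near_zero  <- isTRUE(all.equal(sqrt(2)^2, 2))   # comparison within a tolerance

print(sprintf("%.17g", sqrt(2)^2))   # stored value to 17 significant digits
print(exact_zero)
print(near_zero)
print(.Machine$double.eps)           # machine epsilon for doubles
```

The practical lesson is to compare computed reals with all.equal()
(wrapped in isTRUE()) and not with ==.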

So, Robert, try to be positive about C and FORTRAN etc., rather
than feeling put off by the fact that they are yet more things
to learn and seem to get in the way of understanding how the
functions work. C and FORTRAN are your friends, as well as
the R language itself, and a great deal more friendly than
the raw machine code.

There is one aspect though where R users are in the cold when
it comes to C and FORTRAN. If you want to understand the function
'eigen', say, then you can "?eigen" to learn about its usage.
You can enter "eigen" to see the R code, and indeed that is
not too incomprehensible. But then you find

  .Fortran("ch", n, n, xr, xi, values = dbl.n, 
           !only.values, vectors = xr, ivectors = xi, dbl.n, 
           dbl.n, double(2 * n), ierr = integer(1),
           PACKAGE = "base")

and similar for "rs", "cg" and "rg". Where's the help for
these? Nowhere obvious! In fact you have to go to the source
code, locate the FORTRAN routines, and study these, hoping
that enough helpful comments have been included to steer
your study. So it is a much more formidable task, especially
if you are having to learn the language at the same time.
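
As a first step in that hunt, it helps to list which compiled
routines a function calls, so you know what names to search for in
the sources. The sketch below is one way to do it, by scanning the
deparsed text of a function. Both native_routines and toy_eigen are
names I have made up for illustration; toy_eigen is a cut-down
stand-in with simplified arguments, not the real 'eigen', and it is
never actually called.

```r
# List the routine names passed to .Fortran/.C/.Call/.External
# inside a function. 'native_routines' is a hypothetical helper.
native_routines <- function(f) {
  src <- paste(deparse(f), collapse = " ")
  pattern <- "\\.(Fortran|C|Call|External)\\([[:space:]]*\"[^\"]+\""
  hits <- regmatches(src, gregexpr(pattern, src))[[1]]
  # keep only the text between the quotes
  gsub("^[^\"]*\"|\"$", "", hits)
}

# Hypothetical stand-in for a function that calls Fortran; the argument
# lists are simplified and the function is only inspected, never run.
toy_eigen <- function(x, complex = FALSE) {
  n <- nrow(x)
  if (complex) {
    .Fortran("ch", n, n, Re(x), Im(x), values = double(n), PACKAGE = "base")
  } else {
    .Fortran("rs", n, n, x, values = double(n), PACKAGE = "base")
  }
}

print(native_routines(toy_eigen))
```

With the names in hand, the next stop is the source distribution,
where a package's compiled code lives under its src/ directory. This
only finds routines named by a literal string; it will miss names
held in a variable.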

Best wishes,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 13-Nov-05                                       Time: 23:13:58
------------------------------ XFMail ------------------------------



