[Rd] Julia

William Dunlap wdunlap at tibco.com
Thu Mar 8 23:27:22 CET 2012


I guess my point is not getting across.  The user should see
the functional programming style but under the hood the
evaluator should be able to use whatever memory and time
saving tricks it can.  Julia seems to want to be a nonfunctional
language, which I think makes it harder to write the sort of
easily reusable functions that S allows.
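
A minimal sketch of that contract in R, for illustration only. The helper
zero_first() is a made-up name, not something from this thread. It looks
as if it edits its argument, but the caller never sees a change; whether
the evaluator copies or reuses memory underneath is its own business.

```r
# Hypothetical helper (an assumption for illustration, not part of R):
# it appears to modify its argument, but only the local binding changes.
zero_first <- function(v) {
  v[1] <- 0   # the caller's object is left untouched
  v
}

x <- c(1, 2, 3)
y <- zero_first(x)           # named argument: x must survive unchanged
z <- zero_first(c(1, 2, 3))  # anonymous argument: nobody can observe the input

# x still holds 1 2 3, while y and z both hold 0 2 3
stopifnot(identical(x, c(1, 2, 3)))
stopifnot(identical(y, c(0, 2, 3)), identical(z, y))
```

The function writer never has to decide when to copy, which is what
keeps such a function reusable from any caller.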

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: oliver [mailto:oliver at first.in-berlin.de]
> Sent: Thursday, March 08, 2012 2:23 PM
> To: William Dunlap
> Cc: R-devel
> Subject: Re: [Rd] Julia
> 
> I don't think that using in-place modification as a general property would make
> sense.
> 
> In-place modification brings in side-effects and that would mean that the order
> of evaluation can change the result.
> 
> To get reliable results, the order of evaluation should not be the reason for
> different results, and that's the reason why the functional approach is much
> better for reliable programs.
> 
> So, in general I would say, this feature is a no-no.
> In general I would rather discourage in-place modification.
> 
> For some certain cases it might help...
> but for such certain cases either such a boolean flag or programming a separate
> module in C would make sense.
> 
> There could also be a global in-place-flag that might be used (via options
> maybe) but if such a thing would be implemented, the default value should be
> FALSE.
> 
> 
> 
> Ciao,
>    Oliver
> 
> 
> On Thu, Mar 08, 2012 at 04:21:42PM +0000, William Dunlap wrote:
> > So you propose an inplace=TRUE/FALSE entry for each argument to each
> > function which may want to avoid allocating memory?  The major
> > problem is that the function writer has no idea what the value of
> > inplace should be, as it depends on how the function gets called.
> > This makes writing reusable functions (hence packages) difficult.
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> > > -----Original Message-----
> > > From: oliver [mailto:oliver at first.in-berlin.de]
> > > Sent: Thursday, March 08, 2012 7:40 AM
> > > To: William Dunlap
> > > Cc: R-devel
> > > Subject: Re: [Rd] Julia
> > >
> > > Ah, and you mean if it's an anonymous array it could be reused
> > > directly from the args.
> > >
> > > OK, now I see why you insist on the anonymous data thing.
> > > > I didn't grasp it even in my last mail.
> > >
> > >
> > >
> > > But that somehow also relates to what I wrote about reusing an
> > > already existing, named vector.
> > >
> > > Just the moment of in-place-modification is different.
> > >
> > > From
> > >   x  <- runif(n)
> > >   cx <- cos(x)
> > >
> > > instead of
> > > > > >     cx <- cos(x=runif(n)) # no allocation needed, use the input space for the return value
> > >
> > > to something like
> > >
> > >   cx  <- runif(n)
> > >   cos( cx, inplace=TRUE)
> > >
> > > or
> > >
> > >   cos( runif(n), inplace=TRUE)
> > >
> > >
> > >
> > >
> > > This way it would be possible to specify the reusage of the input
> > > *explicitly* (without  implicit rules like anonymous vs. named values).
> > >
> > >
> > >
> > > In Pseudo-Code something like that:
> > >
> > >    if (in_place == TRUE )
> > >    {
> > >      input_val[idx] = cos( input_val[idx] );
> > >      return input_val;
> > >    }
> > >    else
> > >    {
> > >      result_val = alloc_vec( LENGTH(input_val), ... );
> > >      result_val[idx] = cos( input_val[idx] );
> > >      return result_val;
> > >    }
> > >
> > >
> > >
> > > Is this matching, what you were looking for?
> > >
> > >
> > > Ciao,
> > >    Oliver
> > >
> > >
> > > On Thu, Mar 08, 2012 at 02:56:24PM +0100, oliver wrote:
> > > > Hi,
> > > >
> > > > > ok, thank you for clarifying what you meant.
> > > > > You only referred to the reusage of the args, not of an already
> > > > > existing vector.
> > > > > So I overgeneralized your example.
> > > >
> > > > But when looking at your example,
> > > > and how I would implement the cos() I doubt I would use copying
> > > > the args before calculating the result.
> > > >
> > > > Just allocate a result-vector, and then place the cos() of the
> > > > input-vector into the result vector.
> > > >
> > > > > I didn't look at how it is done in R, but I would guess it's
> > > > like that.
> > > >
> > > >
> > > >   In pseudo-Code something like that:
> > > >     cos_val[idx] = cos( input_val[idx] );
> > > >
> > > > But R also handles complex data with cos() so it will look a bit
> > > > more laborious.
> > > >
> > > > What I have seen so far from implementing C-extensions for R is
> > > > rather C-ish, and so you have the control on many details. Copying
> > > > the input just to read it would not make sense here.
> > > >
> > > > I doubt that R internally is doing that.
> > > > > Or did you find that in the R-code?
> > > >
> > > > The other problem, someone mentioned, was *changing* the contents
> > > > > of a matrix... and that this is NOT done in-place, when using a
> > > > function for it.
> > > > But the namespace-name / variable-name as "references" to the
> > > > matrix might solve that problem.
> > > >
> > > >
> > > > Ciao,
> > > >   Oliver
> > > >
> > > >
> > > >
> > > > On Wed, Mar 07, 2012 at 07:10:43PM +0000, William Dunlap wrote:
> > > > > No my examples are what I meant.  My point was that a function,
> > > > > say cos(), can act like it does call-by-value but conserve
> > > > > > memory when it can if it can distinguish between the case
> > > > > >     cx <- cos(x=runif(n)) # no allocation needed, use the input space for the return value
> > > > > > and the case
> > > > > >    x <- runif(n)
> > > > > >    cx <- cos(x=x) # return value cannot reuse the argument's memory, so allocate space for return value
> > > > > >    sum(x)              # Otherwise sum(x) would return sum(cx)
> > > > > The function needs to know if a memory block is referred to by a
> > > > > name in any environment in order to do that.
> > > > >
> > > > > Bill Dunlap
> > > > > Spotfire, TIBCO Software
> > > > > wdunlap tibco.com
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: oliver [mailto:oliver at first.in-berlin.de]
> > > > > > Sent: Wednesday, March 07, 2012 10:22 AM
> > > > > > To: Dominick Samperi
> > > > > > Cc: William Dunlap; R-devel
> > > > > > Subject: Re: [Rd] Julia
> > > > > >
> > > > > > On Tue, Mar 06, 2012 at 12:49:32PM -0500, Dominick Samperi wrote:
> > > > > > > > On Tue, Mar 6, 2012 at 11:44 AM, William Dunlap
> > > > > > > > <wdunlap at tibco.com> wrote:
> > > > > > > > S (and its derivatives and successors) promises that
> > > > > > > > functions will not change their arguments, so in an
> > > > > > > > expression like
> > > > > > > >   val <- func(arg)
> > > > > > > > you know that arg will not be changed.  You can do that by
> > > > > > > > having func copy arg before doing anything, but that uses
> > > > > > > > space and time that you want to conserve.
> > > > > > > > If arg is not a named item in any environment then it
> > > > > > > > should be fine to write over the original because there is
> > > > > > > > no way the caller can detect that shortcut.  E.g., in
> > > > > > > >    cx <- cos(runif(n))
> > > > > > > > the cos function does not need to allocate new space for
> > > > > > > > its output, it can just write over its input because,
> > > > > > > > without a name attached to it, the caller has no way of
> > > > > > > > looking at what
> > > > > > > > runif(n) returned.  If you did
> > > > > > > >    x <- runif(n)
> > > > > > > >    cx <- cos(x)
> > > > > >
> > > > > > You have two names here, x and cx, hence your example does not
> > > > > > fit into what you want to explain.
> > > > > >
> > > > > > A better example would be:
> > > > > > x <- runif(n)
> > > > > > x <- cos(x)
> > > > > >
> > > > > >
> > > > > >
> > > > > > > > then cos would have to allocate new space for its output
> > > > > > > > because overwriting its input would affect a subsequent
> > > > > > > >    sum(x)
> > > > > > > > I suppose that end-users and function-writers could learn
> > > > > > > > to live with having to decide when to copy, but not having
> > > > > > > > to make that decision makes S more pleasant (and safer) to use.
> > > > > > > > I think that is a major reason that people are able to
> > > > > > > > share S code so easily.
> > > > > > >
> > > > > > > But don't forget the "Holy Grail" that Doug mentioned at the
> > > > > > > start of this thread: finding a flexible language that is
> > > > > > > also fast. Currently many R packages employ C/C++ components
> > > > > > > to compensate for the fact that the R interpreter can be
> > > > > > > > slow, and the pass-by-value semantics of S provides no protection
> > > > > > > > here.
> > > > > > [...]
> > > > > >
> > > > > > The distinction imperative vs. functional has nothing to do
> > > > > > with the distinction interpreted vs. directly executed.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Thinking again on the problem that was mentioned here, I think
> > > > > > it might be circumvented.
> > > > > >
> > > > > > > Looking again at R's properties, looking again into U.Ligges
> > > > > > > "Programmieren in R", I saw there was mentioned that in R
> > > > > > > anything (?!) is an object... so then it's OOP; but also it was
> > > > > > > mentioned, R is a functional language. But this does not mean
> > > > > > > it's purely functional or has no imperative data structures.
> > > > > >
> > > > > > > As R relies heavily on vectors, here we have an imperative
> > > > > > > data structure.
> > > > > >
> > > > > > > So, it rather looks to me that "<-" does work in-place on the
> > > > > > > vectors, even "<-" itself is a function (which does not matter
> > > > > > > for the problem).
> > > > > >
> > > > > > > If that's true (I assume here, it is; correct me, if it's
> > > > > > > wrong), then I think, assigning with "<<-" and assign() also
> > > > > > > would do an imperative (in-place) change of the contents.
> > > > > >
> > > > > > > Then the copying-of-big-objects-when-passed-as-args problem
> > > > > > > can be circumvented by working on either a variable in the
> > > > > > > GlobalEnv (and using "<<-"), or using a certain environment for
> > > > > > > the big data and passing its name (and the variable) as value
> > > > > > > to the function, which then uses assign() and get() to work on
> > > > > > > that data.
> > > > > > > Then in-place modification should be possible.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > In 2008 Ross Ihaka and Duncan Temple Lang published the
> > > > > > > paper "Back to the Future: Lisp as a base for a statistical
> > > > > > > computing system" where they propose Common Lisp as a new
> > > > > > > > foundation for R. They suggest that this could be done while
> > > > > > > > maintaining the same familiar R syntax.
> > > > > > >
> > > > > > > A key requirement of any strategy is to maintain easy access
> > > > > > > to the huge universe of existing C/C++/Fortran numerical and
> > > > > > > graphics libraries, as these libraries are not likely to be rewritten.
> > > > > > >
> > > > > > > Thus there will always be a need for a foreign function
> > > > > > > interface, and the problem is to provide a flexible and
> > > > > > > type-safe language that does not force developers to use
> > > > > > > another unfamiliar, less flexible, and error-prone language
> > > > > > > > to optimize the hot spots.
> > > > > >
> > > > > > > If I hear "type safe" I rather would think about OCaml or
> > > > > > maybe Ada, but not LISP.
> > > > > >
> > > > > > Also, LISP has so many "("'s and ")"'s, that it's making
> > > > > > people going crazy ;-)
> > > > > >
> > > > > > Ciao,
> > > > > >    Oliver
> > > >
> > > > ______________________________________________
> > > > R-devel at r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-devel

