[R] Simulation

Thu May 14 08:26:03 CEST 2009

Barry Rowlingson wrote:
> On Wed, May 13, 2009 at 9:56 PM, Wacek Kusnierczyk
> <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
>   
>> Barry Rowlingson wrote:
>>     
>
>   
>>    library(rbenchmark)  # provides benchmark()
>>    n = 1000
>>    benchmark(columns=c('test', 'elapsed'), order=NULL,
>>       'for'={ l = list(); for (i in 1:n) l[[i]] = rnorm(i, 0, 1) },
>>       lapply=lapply(1:n, rnorm, 0, 1) )
>>    #     test elapsed
>>    # 1    for   9.855
>>    # 2 lapply   8.923
>>
>>
>>     
>>> Yes, you can probably vectorize this with lapply or something, but I
>>> prefer clarity over concision when dealing with beginners...
>>>       
>> but where's the preferred clarity in the for loop solution?
>>     
>
>  Seriously? You think:
>
>  lapply(1:n, rnorm, 0, 1)
>
> is 'clearer' than:
>
> x=list()
> for(i in 1:n){
>   x[[i]]=rnorm(i,0,1)
> }
>
> for beginners?
>   

seriously, i do;  but it does depend on who those beginners are.  if
they come directly from c and the like, you're probably right.

>  Firstly, using 'lapply' introduces a function (lapply) that doesn't
> have an intuitive name. Also, it takes a function as an argument. The
> concept of having a function as a parameter to another function is
> something that a lot of programming beginners have trouble with -
> unless they were brought up on LISP of course, and few of us are.
>   

well, that's one of the first things you learn in a programming
languages course that is not procedural programming-{centered,biased}. 
no need for prior lisp experience.  if messing with closures is not
involved (as here), no advanced discussion is needed.

also, for loops may not be as trivial to explain as you might think. 
note, you're talking about r, not c, and the treatment of iterator
variables in for loops differs between scripting languages:

    perl -e '
       $i = 0;
       for $i (1..5) { # iterate with $i
           };
       print "$i\n" '
    # 0

    ruby -e '
       i = 0
       for i in 1..5 # iterate with i
           end
       printf "%d\n", i '
    # 5

and you've gotten into explaining lexical scoping etc.

>  I propose that the for-loop example is clearer to a larger population
> than the lapply version. 

which population have you sampled from?  you may not be wrong, but give
some data.

> Plus it's only useful in that form if the
> first parameter is the one you want to lapply over. If you want to
> work over the third parameter, say, you then need:
>
>  lapply(1:n,function(i){rnorm(100,0,i)})
>
>  at which point you've introduced anonymous functions. The jump from:
>
>  x[[i]] = rnorm(i,0,1)
> to
>  x[[i]] = rnorm(100,0,i)
>
> is much less than the changes in the lapply version, where you have to
> go 'oh hang on, lapply only works on the first argument, so you have
> to write another function, but you can do that inline like this...'.
>   

you may be unhappy to learn that you're unaware of how the lapply
solution can still be nicely adapted here:

    lapply(1:n, rnorm, n=100, mean=0)
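
a minimal sketch of why this works, assuming the usual signature
rnorm(n, mean, sd):  lapply hands each element over as an unnamed
argument, and since n and mean are already taken by name, the element
is matched to the remaining slot, sd.  the names k, via_lapply and
via_loop are made up for illustration, and k is kept small on purpose.

```r
k <- 5  # hypothetical small count, just for the demonstration

# lapply version: each element of 1:k ends up as sd
set.seed(1)
via_lapply <- lapply(1:k, rnorm, n = 100, mean = 0)

# explicit loop version, with the same seed for comparison
set.seed(1)
via_loop <- list()
for (i in 1:k) via_loop[[i]] <- rnorm(100, 0, i)

# both approaches draw the same numbers in the same order
print(identical(via_lapply, via_loop))
```

so no anonymous function is needed as long as the other arguments can
be given by name.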

> Okay, maybe my example is a little contrived, but I still think for a
> beginners context it's important not to jump too many paradigms at a
> time.
>   

for a complete beginner, the jump into for loops may not be as trivial
as you seem to think.  there's still quite some stuff to be explained to
clarify that

    i = 0
    for (i in 1:n)
       # do stuff
    print(i)

will print n, not 0.  unless n=0, of course.
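
a runnable sketch of the point above, with a contrast to the lapply
version.  the names n_demo, after_loop, after_zero, after_lapply and
squares are made up for illustration;  the behaviour shown is what i'd
expect from r's for loop and function calls, not a full account of its
scoping rules.

```r
n_demo <- 3  # hypothetical value standing in for n

# the for loop assigns to i in the enclosing environment
i <- 0
for (i in 1:n_demo) {
  # do stuff
}
after_loop <- i  # last value taken from 1:n_demo

# 1:0 is c(1, 0), so the body runs twice and i ends up as 0
i <- 99
for (i in 1:0) {
}
after_zero <- i

# a function parameter is local to the call, so the outer i is untouched
i <- 0
squares <- lapply(1:n_demo, function(i) i^2)
after_lapply <- i

print(c(after_loop, after_zero, after_lapply))
```

the loop overwrites i, while the function passed to lapply leaves it
alone, which is one more thing a beginner has to be told about.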

vQ