[R-sig-Geo] joining envelope objects for parallel computing. Possible?

Quets Jan Jan.Quets at ua.ac.be
Mon Sep 12 13:59:30 CEST 2011


Thanks for the ideas received,

@Rolf:

I already managed to parallelize the sim_pat_list,

and calculating the summary functions is indeed generally faster,
but in my case it still takes about 20% (about 2 hours) of the total time (about 10 hours) so far.

However, I must say that most of the time is spent in the
alltypes() function (for bivariate point patterns) and not so much in the envelope() function.


I hope the feedback I received also applies to alltypes().

I will try it,

Jan   




________________________________________
From: Rolf Turner [r.turner at auckland.ac.nz]
Sent: Monday, 12 September 2011 12:39
To: Quets Jan
CC: r-sig-geo at r-project.org
Subject: Re: [R-sig-Geo] joining envelope objects for parallel computing. Possible?

On 12/09/11 21:03, Quets Jan wrote:
> Hi,
>
> is it possible to join several envelope (spatstat) objects into a single object?

     See fortune("This is R")

     You could do it, I think, at the expense of writing a bit of additional
     code.  Something along the following lines:

     * For each of your calls to envelope, use the argument savefuns=TRUE.
     * From each of the ENV_OBJ_j, extract the "simfuns" attribute.
     * Change the class of each extracted "simfuns" attribute to "data.frame"
        (from c("fv","data.frame")), discard the "r" columns, and then cbind
        them all together; make the result into a *matrix* rather than a
        data frame.  Call it, say, "M".
     * Apply the appropriate function across the rows of M, to obtain "lo"
        and "hi"; e.g.

             LH <- t(apply(M, 1, function(x, m) {
                 x <- sort(x)
                 c(x[m], x[length(x) - m + 1])
             }, m = 5))

        The value "m=5" corresponds to setting "nrank=5" in a ``direct'' call to
        envelope().  The first column of LH is "lo"; the second is "hi".

     * These vectors, "lo" and "hi", are then the lower and upper bounds
        respectively of the required envelope.

     The foregoing can be wrapped up in a convenient function.
     Exercise for the reader! :-)
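
     One possible shape for such a wrapper, as a minimal base-R sketch rather
     than a tested spatstat recipe. The name join_envelopes() is hypothetical,
     and the input is assumed to be a list of plain data frames, each holding
     an "r" column plus one column per simulation (that is, the "simfuns"
     attributes after their class has been changed to "data.frame"). Synthetic
     numbers stand in for real summary-function values here.

```r
# Hypothetical helper (not part of spatstat): pool simulated function values
# from several pieces and compute pointwise lower/upper envelope bounds.
# Assumes every piece was evaluated at the same "r" values.
join_envelopes <- function(simfuns_list, m = 1) {
  r <- simfuns_list[[1]]$r
  # Drop the "r" column from each piece and bind the rest column-wise.
  M <- do.call(cbind, lapply(simfuns_list, function(d) {
    as.matrix(d[, names(d) != "r", drop = FALSE])
  }))
  stopifnot(m >= 1, 2 * m <= ncol(M))
  # m-th smallest and m-th largest value in each row (i.e. at each r).
  LH <- t(apply(M, 1, function(x, m) {
    x <- sort(x)
    c(x[m], x[length(x) - m + 1])
  }, m = m))
  data.frame(r = r, lo = LH[, 1], hi = LH[, 2])
}

# Synthetic stand-in for the "simfuns" data of two envelope objects,
# each holding 12 simulated curves on a common grid of r values.
set.seed(1)
r <- seq(0, 1, by = 0.25)
make_piece <- function(nsim) {
  data.frame(r = r, matrix(rnorm(length(r) * nsim), nrow = length(r)))
}
pieces <- list(make_piece(12), make_piece(12))

env <- join_envelopes(pieces, m = 2)
print(env)
```

     The result is a plain data frame, not an envelope object, so it would
     have to be plotted or compared with the observed function by hand.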

     HTH

         cheers,

             Rolf

P. S.  Not clear to me that this is worth doing; the time consuming part is the
simulation of the patterns, which you've already got in the "sim_pat_list".  Once
you have the patterns, calculating the summary functions is usually very fast,
and hence not worth parallelizing. Unless you have an extraordinarily complicated
setting.

             R.
> The reason would be saving computation time with parallel computing.
>
> example:
>
> suppose I already have a list of 120 simulated patterns, called sim_pat_list.
>
> and suppose this is what I want:
>
> ENV_OBJ = envelope(X,fun=pcf,simulate=sim_pat_list)
>
>
> but suppose I have 12 cores available.
>
> and that I split the list sim_pat_list into 10 lists of 12 patterns each, and calculate 10 envelope objects from them:
>
> ENV_OBJ_1 = envelope(X,fun=pcf,simulate=sim_pat_list1)
> ENV_OBJ_2 = envelope(X,fun=pcf,simulate=sim_pat_list2)
>
> ENV_OBJ_10 = envelope(X,fun=pcf,simulate=sim_pat_list10)
>
>
> is it then possible to produce something like this (pseudo-code):
>
> ENV_OBJ = join(ENV_OBJ_1, ENV_OBJ_2, ..., ENV_OBJ_10)
>
> to get the same result?
>
> Thank you,
> Jan
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo



