[R] More simple implementation is slow.
peter dalgaard
pdalgd at gmail.com
Sat Jun 9 13:07:51 CEST 2012
On Jun 9, 2012, at 11:08 , wl2776 wrote:
> Hi all.
> I'm developing a function that must return a square matrix.
>
> Here is the code:
> http://pastebin.com/THzEW9N7
>
> These functions implement the equivalent of two nested for loops.
>
> The first variant creates the resulting matrix by columns, cbind()-ing them
> one by one.
> The second variant creates a two-column matrix whose rows contain
> all possible
> combinations of i and j, and calls apply() on them.
>
> The test input (data frame cp.table) can be produced with the following
> commands:
>> n<-132
>> cpt<-data.frame(x=runif(n, min=0, max=100), y=runif(n, min=0, max=100),
>> la=runif(n, min=0, max=360), phi=runif(n, min=-90, max=90))
> Any random data will do.
>
> The second variant seems to me much more readable and beautiful.
> However, the first ugly variant runs much faster.
> Why??
> Here are the profiles:
>
Nope, they weren't...
Anyways, you're effectively looping over N^2 (i,j) combinations, with complex indexing all the way, without making proper use of vectorization. As far as I can tell, what you're computing amounts to
with(cp.table,
sqrt(outer(x, x, "-")^2 + outer(y, y, "-")^2)
)
or even
dist(cp.table[1:2])
both of which should be a good deal faster.
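
To check that claim on the test data, here is a minimal sketch. It assumes the pastebin functions compute plain Euclidean distances between the (x, y) rows of cp.table, which is only my reading of them; the double loop below is a hypothetical stand-in for that code, not a copy of it. The la and phi columns are left out because neither replacement uses them, and the seed is arbitrary.

```r
set.seed(1)   # arbitrary seed, only so the run is reproducible
n <- 132
cp.table <- data.frame(x = runif(n, min = 0, max = 100),
                       y = runif(n, min = 0, max = 100))

# Hypothetical stand-in for the original code: explicit double loop
# over all (i, j) pairs, filling the matrix one element at a time.
d.loop <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d.loop[i, j] <- sqrt((cp.table$x[i] - cp.table$x[j])^2 +
                         (cp.table$y[i] - cp.table$y[j])^2)
  }
}

# Vectorized version: outer() builds all pairwise differences at once.
d.outer <- with(cp.table,
                sqrt(outer(x, x, "-")^2 + outer(y, y, "-")^2))

# dist() returns a compact "dist" object (lower triangle only);
# as.matrix() expands it to the full square matrix.
d.dist <- as.matrix(dist(cp.table[1:2]))

# Compare the results; dimnames differ, so attributes are ignored.
all.equal(d.loop, d.outer)
all.equal(d.loop, d.dist, check.attributes = FALSE)
```

Note that dist() on its own does not give a square matrix, so wrap it in as.matrix() if that is what the function must return. Wrapping each variant in system.time() will show the difference in speed on your machine.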
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com