[R] Very Slow Gower Similarity Function
Jari Oksanen
jari.oksanen at oulu.fi
Mon Apr 18 18:58:29 CEST 2005
On 18 Apr 2005, at 19:10, Tyler Smith wrote:
> Hello,
>
> I am a relatively new user of R. I have written a basic function to
> calculate the Gower similarity coefficient. I was motivated to do so
> partly as an exercise in learning R, and partly because the existing
> option (vegdist in the vegan package) does not accept missing values.
>
Speed is the reason to use C instead of R. It should be easy, almost
trivial, to modify vegdist.c so that it handles missing values. I
guess this handling means ignoring a value pair if either of its values
is missing -- which is not so gentle to the metric properties so dear
to Gower. Package vegan is designed for ecological community data, which
generally do not have missing values (except in environmental data),
but contributions are welcome.
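
To make the "ignore the pair" idea concrete, here is a minimal sketch in C of a Gower dissimilarity between two rows that skips any variable where either value is missing. It is not the code in vegdist.c: the function name gower_pair, its argument list, and the choice to skip columns with zero range are all assumptions made for illustration. The matrix is assumed to be stored column-major, as R stores it, with missing values represented as NaN; inside R's C interface the ISNAN macro from R's headers would probably be the better test.

```c
#include <math.h>

/* Hypothetical helper, not taken from vegan: Gower dissimilarity between
 * rows i and j of an nr x nc matrix x stored column-major.
 * range[k] is max - min of column k, computed beforehand by the caller.
 * A column is skipped when either value is NaN (pairwise deletion) or
 * when its range is not positive (an assumption of this sketch).
 * Returns NaN if no column could be used. */
static double gower_pair(const double *x, int nr, int nc,
                         int i, int j, const double *range)
{
    double sum = 0.0;
    int count = 0;

    for (int k = 0; k < nc; k++) {
        double a = x[i + nr * k];
        double b = x[j + nr * k];

        if (isnan(a) || isnan(b) || !(range[k] > 0.0))
            continue;               /* ignore this value pair */

        sum += fabs(a - b) / range[k];
        count++;
    }
    /* Average over the variables actually compared, not over nc. */
    return count > 0 ? sum / count : NAN;
}
```

The similarity is one minus this value. Because each pair of rows may be averaged over a different set of variables, the result need not satisfy the triangle inequality, which is the loss of metric properties mentioned above.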
> I think I have succeeded - my function gives me the correct values.
> However, now that I'm starting to use it with real data, I realise it's
> very slow. It takes more than 45 minutes on my Windows 98 machine
> (R 2.0.1 Patched (2005-03-29)) with a 185x32 matrix with ca 100 missing
> values. If anyone can suggest ways to speed up my function I would
> appreciate it. I suspect having a pair of nested for loops is the
> problem, but I couldn't figure out how to get rid of them.
cheers, jari oksanen
--
Jari Oksanen, Oulu, Finland