[R] Very Slow Gower Similarity Function
Anon.
bob.ohara at helsinki.fi
Mon Apr 18 19:36:56 CEST 2005
Jari Oksanen wrote:
>
> On 18 Apr 2005, at 19:10, Tyler Smith wrote:
>
>> Hello,
>>
>> I am a relatively new user of R. I have written a basic function to
>> calculate the Gower similarity function. I was motivated to do so
>> partly as an exercise in learning R, and partly because the existing
>> option (vegdist in the vegan package) does not accept missing values.
>>
> Speed is the reason to use C instead of R. It should be easy, almost
> trivial, to modify the vegdist.c so that it handles missing values. I
> guess this handling means ignoring the value pair if one of the values
> is missing -- which is not so gentle to the metric properties so dear
> to Gower. Package vegan is designed for ecological community data
> which generally do not have missing values (except in environmental
> data), but contributions are welcome.
>
The only reason you never see ecological community data with missing
values is that the ecologists remove those species/sites from their
Excel sheets before they give them to you to sort out their mess. This is
actually one of the few things they know how to do in Excel - I'm
dreading the day when a paper appears in JAE saying that you can use
Excel to produce P-values.

To be slightly more serious, as an exercise the OP could consider
writing a wrapper function in R that removes the missing data and then
calls vegdist to calculate his Gower similarity index.
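
A minimal sketch of such a wrapper, under some assumptions: the name
gower_complete and its dist_fun argument are hypothetical and not part of
vegan, the data have sites in rows, and "removing the missing data" is
taken to mean dropping every row that contains an NA. By default it
calls vegdist with method = "gower", which returns dissimilarities, so
the similarity is taken as 1 minus the result.

```r
# Hypothetical wrapper: gower_complete is an illustrative name, not a vegan function.
# It drops rows containing NA, then hands the complete rows to a distance function.
gower_complete <- function(x, dist_fun = NULL, ...) {
  x <- as.data.frame(x)
  keep <- stats::complete.cases(x)   # TRUE for rows without any NA
  if (sum(keep) < 2) {
    stop("fewer than two rows are free of missing values")
  }
  if (!all(keep)) {
    message("Dropping ", sum(!keep), " row(s) with missing values")
  }
  if (is.null(dist_fun)) {
    # Default: Gower dissimilarity from vegan (assumes vegan is installed)
    dist_fun <- function(y, ...) vegan::vegdist(y, method = "gower", ...)
  }
  dist_fun(x[keep, , drop = FALSE], ...)
}

# Made-up example data: four sites, two species, one missing count
sites <- data.frame(
  sp1 = c(1, 2, NA, 4),
  sp2 = c(2, 4, 6, 8),
  row.names = c("A", "B", "C", "D")
)

if (requireNamespace("vegan", quietly = TRUE)) {
  d <- gower_complete(sites)   # dissimilarities among sites A, B and D
  print(1 - d)                 # similarity = 1 - dissimilarity
}
```

Dropping whole rows is cruder than ignoring only the value pairs where
one value is missing, as discussed in the quoted message: site C
disappears from the result entirely, although its sp2 count was
observed.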
Bob
--
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland
Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax: +358-9-191 51400
WWW: http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org