[R] Euclidean distance function

Arbuckle k.arbuckle at liverpool.ac.uk
Fri Aug 24 12:56:51 CEST 2012


Hi,

I should preface this problem with a statement that although I am sure this
is a really easy function to write, I have tried and failed to get my head
around writing functions in R. I can use R where functions exist to do what
I want done, but have found myself completely incapable of writing them
myself.

The problem is that I have a table with several rows of species and several
columns of trait data for each species. Now what I want to do is, for each
possible pair of species, extract the Euclidean distance between them based
on specified trait data columns. As far as I can see, the dist()
function can manage this to some extent for 2 dimensions (traits) per
species, but I need a more general function that can handle n dimensions.
Ideally this function would allow me to choose which columns (traits) to use
to calculate the Euclidean distance rather than having to reformat the
dataset every time.
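
One way to approach the column-selection idea: base R's dist() computes the distance over every column of the numeric matrix it is given, so it is not limited to two traits, and subsetting the columns before the call may already be enough. The following is a minimal sketch, assuming the data sit in a data frame with a Species column; the names `traits` and `trait_dist` are made up for illustration.

```r
# Hypothetical example data; object and function names are illustrative only.
traits <- data.frame(
  Species = c("spA", "spB", "spC", "spD"),
  x = c(2.9, 5.5, 1.4, 8.3),
  y = c(34.2, 46.5, 48.6, 56.1),
  z = c(0.54, 0.45, 0.84, 0.48),
  n = c(15.7, 19.4, 24.8, 21.3)
)

# Euclidean distances between all pairs of rows, using only the chosen columns.
trait_dist <- function(data, cols, id = "Species") {
  m <- as.matrix(data[, cols, drop = FALSE])   # keep only the selected traits
  rownames(m) <- as.character(data[[id]])      # label rows by species name
  dist(m, method = "euclidean")                # lower-triangular "dist" object
}

d <- trait_dist(traits, c("x", "z", "n"))
print(d)
```

Passing a different `cols` vector, for example `c("x", "y")`, changes the traits used without reformatting the dataset.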

In the hope of clarifying this with a simplified example, I want to take a
dataset like this:

Species    x      y       z       n
spA        2.9    34.2    0.54    15.7
spB        5.5    46.5    0.45    19.4
spC        1.4    48.6    0.84    24.8
spD        8.3    56.1    0.48    21.3

Then extract the Euclidean distances using the general equation
d=sqrt[(x2-x1)^2+(y2-y1)^2+...+(n2-n1)^2] for particular data columns. So in
this example I might want the distances using the traits x, z and n, thereby
specifying the equation to be d=sqrt[(x2-x1)^2+(z2-z1)^2+(n2-n1)^2], and
return a distance matrix as follows (calculated distances represented by .
for the purposes of this example):

Species    spA    spB    spC
spB        .
spC        .      .
spD        .      .      .
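
The pairwise formula can also be written out by hand, which may help show how a small custom function is put together in R. This is only a sketch under the assumption that the data are held in a data frame like the example above; `traits` and `pairwise_euclid` are hypothetical names, and the built-in dist() would normally be the simpler choice.

```r
# Hypothetical example data; object and function names are illustrative only.
traits <- data.frame(
  Species = c("spA", "spB", "spC", "spD"),
  x = c(2.9, 5.5, 1.4, 8.3),
  y = c(34.2, 46.5, 48.6, 56.1),
  z = c(0.54, 0.45, 0.84, 0.48),
  n = c(15.7, 19.4, 24.8, 21.3)
)

# Apply d = sqrt(sum((a - b)^2)) to every pair of rows over the chosen columns.
pairwise_euclid <- function(data, cols, id = "Species") {
  m <- as.matrix(data[, cols, drop = FALSE])
  labels <- as.character(data[[id]])
  k <- nrow(m)
  out <- matrix(NA_real_, nrow = k, ncol = k, dimnames = list(labels, labels))
  for (i in seq_len(k)) {
    for (j in seq_len(i - 1)) {               # only pairs below the diagonal
      out[i, j] <- sqrt(sum((m[i, ] - m[j, ])^2))
    }
  }
  # Drop the first row and last column to match the lower-triangle layout.
  out[-1, -k, drop = FALSE]
}

d <- pairwise_euclid(traits, c("x", "z", "n"))
print(round(d, 2), na.print = "")
```

Cells above the diagonal are left as NA and printed blank, so the result has the same shape as the layout sketched above.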

I hope this makes sense. I only presume that this would be a quick and easy
function to write on the basis that the underlying process is basically
simple maths repeated for each pair of species. Again I have no experience
in writing custom functions (no matter how simple) and just can't seem to
get into my head how to go about it.

I look forward to your response and hope someone gets bored enough to
quickly write out the code to implement this function. Thank you in advance.

Best wishes,

Kev





