[R] relative euclidean distance

S Ellison S.Ellison at LGCGroup.com
Thu Jul 7 12:22:44 CEST 2011


> -----Original Message-----
> > I would like to calculate the RELATIVE euclidean distance. 
> Is there a 
> > function in R which does it ?
> > 
> 
> A simple solution to this is to transform the data and then 
> compute the Euclidean distance using dist().
> 
> decostand(foo, method = "normalize") and
> 
> disttransform(foo, method = "chord") in package BiodiversityR 
> 

See also ?scale in the base package, which will centre and scale by sd by default.

But 'relative euclidean distance' is not that straightforward to explain. 'Relative' usually means 'divided by the true (or mean) value', or at least it does for most chemists. You almost certainly don't mean 'euclidean distance divided by mean euclidean distance'. I suspect - because I'm a chemist and it's what I'd be considering in your shoes - that what you're asking for is the euclidean distance between points defined by the concentrations of your 94 analytes, each scaled by its mean value. 

scale() will (by default) centre each column on its mean and divide by its sd, and that is usually the most sensible thing to do for multivariate data sets where the units or scales for each variable are very different. Scaling by sd and scaling by mean value could give appreciably different answers. Although relative sd for chemical measurement is often near-constant over modest ranges, there is no particular reason to expect that the sd is strictly proportional to the mean over orders of magnitude, and in fact it generally isn't (relative SD tends to be larger for low-level analytes than for higher levels). The essential difference between the two is that if you divide by mean value, variables with a large relative SD will tend to dominate the variations in 'distance', whereas if you centre and divide by SD, that won't happen to the same extent. But which option is more useful is hard to predict. 
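
To make that difference concrete, here is a rough sketch on made-up data (two simulated 'analytes' whose means, sds and names are my own assumptions, not anything from your data set). After dividing by the mean, each column's spread is simply its relative SD, so the noisier low-level column carries most of the distance; after the default scale() both columns have unit spread.

```r
# Simulated data for illustration only: the values below are assumptions.
set.seed(1)
n <- 50
x <- cbind(high = rnorm(n, mean = 100, sd = 5),    # relative SD about 5%
           low  = rnorm(n, mean = 1,   sd = 0.3))  # relative SD about 30%

x.sd   <- scale(x)                                       # centre, divide by sd
x.mean <- scale(x, center = FALSE, scale = colMeans(x))  # divide by mean only

apply(x.sd, 2, sd)    # both columns have sd 1 by construction
apply(x.mean, 2, sd)  # each column's sd is now its relative SD

d.sd   <- dist(x.sd)
d.mean <- dist(x.mean)

# How similar are the two sets of pairwise distances?
cor(as.vector(d.sd), as.vector(d.mean))
```

With real data the two distance matrices may agree more or less closely than this, which is part of why it is worth looking at both.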

Me, I think I'd try both and see which made most sense. 

Incidentally, scaling by mean value without centring using scale() would probably look something like

x.scaled <- scale(x, center=FALSE, scale=apply(x,2,mean))

assuming x has columns corresponding to your measurements and
dist(x.scaled) 
then gives you your distance matrix.
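
As a quick sanity check, something like the following toy example (three hypothetical samples and two hypothetical analytes, A and B, with values chosen purely for easy arithmetic) shows what the scaling does. colMeans(x) is used here as an equivalent of apply(x,2,mean).

```r
# Toy data, invented for illustration: rows are samples, columns are analytes.
x <- matrix(c(10, 20, 30,
               1,  2,  3),
            ncol = 2,
            dimnames = list(c("s1", "s2", "s3"), c("A", "B")))

# Divide each column by its mean, without centring
x.scaled <- scale(x, center = FALSE, scale = colMeans(x))

colMeans(x.scaled)   # every column now has mean 1

d <- dist(x.scaled)  # Euclidean distances between the scaled samples
as.matrix(d)["s1", "s2"]
```

Both columns become 0.5, 1, 1.5 after scaling, so although A is ten times larger than B in raw units, the two contribute equally to the distance, and the s1-s2 distance is sqrt(0.5^2 + 0.5^2).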


Steve E
lab of the Government Chemist
UK
