[R] adding a method to the dist function

Giampiero Salvi giampi at speech.kth.se
Mon May 3 14:01:23 CEST 2004


On Mon, 3 May 2004, Prof Brian Ripley wrote:

> dist() compares pairs of rows in the x matrix.  How can they have `means
> and covariances'? -- you have a sample of size one from each of two
> populations.
>
> It seems that (Gaussian) Bhattacharyya is more like mahalanobis().

I had planned to use the mean vectors and covariance matrices I computed
over N groups of data samples as input to dist, one row per group, like this:

mu_1_1 mu_1_2 ... mu_1_M cov_1_1_1 cov_1_1_2 ... cov_1_M_M
mu_2_1 mu_2_2 ... mu_2_M cov_2_1_1 cov_2_1_2 ... cov_2_M_M
...
mu_N_1 mu_N_2 ... mu_N_M cov_N_1_1 cov_N_1_2 ... cov_N_M_M

where N is the number of groups and M the dimension, so each row holds
M + M*M values.
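
A minimal sketch of how such a matrix could be built from raw data and a
grouping factor. The helper name pack_groups is hypothetical, and the iris
data set is only a convenient stand-in for the real groups.

```r
# Hypothetical helper: one row per group, holding the M column means
# followed by the M*M entries of the group's covariance matrix.
pack_groups <- function(x, g) {
  x <- as.matrix(x)
  g <- as.factor(g)
  rows <- lapply(levels(g), function(lv) {
    xi <- x[g == lv, , drop = FALSE]
    # cov() is symmetric, so row-major and column-major unrolling agree
    c(colMeans(xi), as.vector(cov(xi)))
  })
  out <- do.call(rbind, rows)
  rownames(out) <- levels(g)
  out
}

# Stand-in data: N = 3 groups, M = 4 dimensions
packed <- pack_groups(iris[, 1:4], iris$Species)
dim(packed)
```

Each group needs at least two samples, otherwise cov() returns NA entries.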

I agree that it would be better to use a new function (similar to
mahalanobis), since dist in all other cases operates on raw data
samples, and my interpretation of the input data might be confusing.
The reason I thought of dist is that the Bhattacharyya distance is
symmetric, so the result fits the dist class well.

One way to solve this, if you agree, would be to write a new function
bhattacharyya() that returns a dist object.
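
A rough sketch of what such a function might look like, assuming the Gaussian
form of the Bhattacharyya distance and the packed row layout shown above. The
signature bhattacharyya(x, M) is only an assumption for illustration, not an
existing R function.

```r
# Hypothetical sketch: Gaussian Bhattacharyya distance between the rows of x,
# where each row holds M means followed by the M*M covariance entries.
bhattacharyya <- function(x, M) {
  x <- as.matrix(x)
  stopifnot(ncol(x) == M + M * M)
  N <- nrow(x)
  mu <- x[, seq_len(M), drop = FALSE]
  S <- lapply(seq_len(N), function(i) matrix(x[i, -seq_len(M)], M, M))
  logdet <- function(A) as.numeric(determinant(A, logarithm = TRUE)$modulus)
  d <- matrix(0, N, N, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(N - 1)) {
    for (j in (i + 1):N) {
      Sm <- (S[[i]] + S[[j]]) / 2          # average covariance
      dm <- mu[i, ] - mu[j, ]              # difference of the means
      term1 <- drop(crossprod(dm, solve(Sm, dm))) / 8
      term2 <- 0.5 * (logdet(Sm) - 0.5 * (logdet(S[[i]]) + logdet(S[[j]])))
      d[i, j] <- d[j, i] <- term1 + term2
    }
  }
  as.dist(d)                               # symmetric, so it fits class "dist"
}

# Two groups in M = 2 dimensions, both with identity covariance
x <- rbind(a = c(0, 0, 1, 0, 0, 1),
           b = c(2, 0, 1, 0, 0, 1))
d <- bhattacharyya(x, M = 2)
d
```

With equal covariances the second term vanishes and the result is one eighth
of the squared Mahalanobis distance between the means, here 4/8 = 0.5.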

Giampiero



