[R] distance metrics

Gavin Simpson gavin.simpson at ucl.ac.uk
Tue Mar 13 00:21:22 CET 2007


On Mon, 2007-03-12 at 16:02 -0700, Sender wrote:
> Thanks for the suggestion Christian. I'm trying to avoid expanding the dist
> object to a matrix, since I'm usually working with microarray data which
> produces a distance matrix of size 5000 x 5000.
> 
> If I can keep it in its condensed form I think it will speed things up.
> 
> Is my thinking correct?

That will all depend on what you want to do with it...

A dist object of that size is c. 100 MB in memory, and c. 200 MB in size
as the full dissimilarity matrix (values from object.size()). The dist
object stores 5000 * 4999 / 2 = 12,497,500 doubles at 8 bytes each; the
full matrix stores 5000^2 = 25,000,000. You'll also need a reasonable
amount of free memory over and above this to do anything useful with the
matrix, as copies may be required during analysis/processing etc.
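
As a rough check of those figures, here is a minimal sketch of the
arithmetic. It assumes 8-byte doubles and ignores the small overhead of
attributes and dimnames; the small data set at the end is made up purely
for illustration.

```r
# Back-of-envelope sizes for n = 5000 observations, assuming 8-byte doubles
# and ignoring the small overhead of attributes and dimnames.
n <- 5000
n_pairs      <- n * (n - 1) / 2   # entries stored in the dist object
dist_bytes   <- n_pairs * 8       # condensed (dist) form
matrix_bytes <- n^2 * 8           # full n x n matrix

cat("dist:  ", dist_bytes / 1e6, "MB\n")
cat("matrix:", matrix_bytes / 1e6, "MB\n")

# Compare both forms with object.size() on a small, made-up data set
set.seed(1)
x <- matrix(rnorm(200 * 5), nrow = 200)
d <- dist(x)
print(object.size(d), units = "Kb")
print(object.size(as.matrix(d)), units = "Kb")
```

The full matrix is roughly twice the size because it stores both
triangles plus the diagonal.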

A dist object is just a vector of observed distances (the lower triangle
of the matrix, stored by columns) with various attributes, so one can
always use "[" for vectors, but I imagine that anything other than
trivial operations will become fiddly, complicated and time consuming.
If you have the memory, give the as.matrix option a try and see how it
works for your specific problems.
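
For the trivial case of looking up single entries, something along these
lines might do. It is only a sketch: dist_get() is a hypothetical helper,
not a function from base R or any package, and it relies on the storage
layout described in ?dist, where for i < j the distance sits at position
n*(i-1) - i*(i-1)/2 + j-i of the vector.

```r
# Hypothetical helper (not part of base R): look up d[i, j] in a dist
# object without expanding it to a full matrix. i and j may be vectors.
dist_get <- function(d, i, j) {
  stopifnot(inherits(d, "dist"))
  n <- attr(d, "Size")
  stopifnot(all(i >= 1), all(i <= n), all(j >= 1), all(j <= n))
  lo <- pmin(i, j)                 # distances are symmetric, so order
  hi <- pmax(i, j)                 # each pair as (smaller, larger)
  k  <- n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo   # position, see ?dist
  out <- numeric(length(k))        # diagonal entries stay at 0
  off <- lo != hi
  out[off] <- d[k[off]]            # plain vector indexing on the dist object
  out
}

# Usage on a small, made-up data set, checked against as.matrix()
set.seed(42)
x <- matrix(rnorm(40), nrow = 10)
d <- dist(x)
m <- as.matrix(d)
i <- c(3, 7, 5, 10)
j <- c(7, 3, 5, 1)
all.equal(dist_get(d, i, j), m[cbind(i, j)])
```

Anything beyond lookups like this (row or column extraction, subsetting
to a smaller dist object) needs more index bookkeeping, which is where it
gets fiddly.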

G

> 
> 
> On 3/12/07, Christian Hennig <chrish at stats.ucl.ac.uk> wrote:
> >
> > On Mon, 12 Mar 2007, Sender wrote:
> >
> > > Hello:
> > >
> > > Does anyone know if there exists a package that handles methods for [ for
> > > dist objects?
> > >
> > > I would like to access a dist object using matrix notation
> > >
> > > e.g.
> > >
> > > dMat = dist(x)
> > > dMat[i,j]
> >
> > Try
> > dMat <- as.matrix(dist(x))
> >
> > Christian
> >
> >
> >
> > *** --- ***
> > Christian Hennig
> > University College London, Department of Statistical Science
> > Gower St., London WC1E 6BT, phone +44 207 679 1698
> > chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
> >
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                     [t] +44 (0)20 7679 0522
ECRC                              [f] +44 (0)20 7679 0565
UCL Department of Geography
Pearson Building                  [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street
London, UK                        [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT                          [w] http://www.freshwaters.org.uk/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


