[R] Calculating the distance samples using distance metics method

Bill.Venables at csiro.au
Wed Feb 20 00:58:46 CET 2008


Distance matrices are not usually an end in themselves but a means to
some other end.  Rather than ask what is the best way to calculate such
a huge distance matrix, maybe the question you should ask yourself is
what you are going to do with it if you ever did manage to calculate it.

Maybe you can bypass the distance matrix calculation and get to the end
point by some other means.  For example, if the eventual goal is
clustering, then perhaps something like clara() in the 'cluster' package
will do the job more effectively.  It is designed to handle situations
of this kind.
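
As a rough illustration of the clara() suggestion, here is a minimal
sketch on simulated data.  The matrix x, the two-group structure and the
choices k = 2 and samples = 20 are assumptions made up for this example,
not values taken from the original question.  Rows are the objects being
clustered.

```r
library(cluster)  # provides clara(); a recommended package shipped with R

set.seed(1)
# Simulated stand-in for real data: 2000 observations on 10 variables,
# drawn from two groups with shifted means (purely illustrative).
n <- 1000
x <- rbind(matrix(rnorm(n * 10, mean = 0), ncol = 10),
           matrix(rnorm(n * 10, mean = 4), ncol = 10))

# clara() runs PAM on sub-samples of the rows and then assigns every
# observation to its nearest medoid, so the full n x n distance matrix
# is never formed.
fit <- clara(x, k = 2, metric = "euclidean", samples = 20)

table(fit$clustering)   # cluster sizes
fit$medoids             # the representative observations (medoids)
```

The metric argument also accepts "manhattan".  Whether k-medoids
clustering suits the real data is a separate question that this sketch
does not address.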


Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:                         +61 4 8819 4402
Home Phone:                     +61 7 3286 7700
mailto:Bill.Venables at csiro.au
http://www.cmis.csiro.au/bill.venables/ 

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Keizer_71
Sent: Wednesday, 20 February 2008 9:35 AM
To: r-help at r-project.org
Subject: [R] Calculating the distance samples using distance metics
method


## reading in data

data<-read.table("microarray.txt",header=TRUE, sep="\t")

head(data)

dim(data)

attach(data)

## creating matrix and calculating variance across probesets


x<-1:20000

y<-2:141

data.matrix<-data.matrix(data[,y])

variableprobe<-apply(data.matrix[x,],1,var)

hist(variableprobe)

## filter out low variance

data.sub = data.matrix[order(variableprobe,decreasing=TRUE),][1:10000,]

dim(data.sub)
[1] 10000   140

What is the best way to calculate the distances between the samples
using the Euclidean or Manhattan distance metrics?

Any suggestions?
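
For reference, base R's dist() computes distances between the rows of a
matrix, so when samples are stored in columns (as they appear to be in
data.sub above) the matrix has to be transposed first.  A minimal sketch
on simulated data follows; the matrix m and its dimensions are made up
for illustration and stand in for data.sub.

```r
set.seed(1)
# Simulated stand-in for data.sub: 100 probesets (rows) by 6 samples
# (columns).  Dimensions and column names are illustrative only.
m <- matrix(rnorm(100 * 6), nrow = 100,
            dimnames = list(NULL, paste0("sample", 1:6)))

# dist() works on rows, so transpose to put the samples in rows.
d.euc <- dist(t(m), method = "euclidean")
d.man <- dist(t(m), method = "manhattan")

# Convert to a full symmetric matrix: one row and column per sample.
dim(as.matrix(d.euc))
```

With 140 samples the result is only a 140 x 140 matrix.  A distance
matrix between the 10000 probesets would be far larger.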
-- 
View this message in context:
http://www.nabble.com/Calculating-the-distance-samples-using-distance-metics-method-tp15578860p15578860.html
Sent from the R help mailing list archive at Nabble.com.



