[R] repeating a function across a data frame

JenniferH jenachobbs at gmail.com
Wed Aug 1 11:42:48 CEST 2012


Hello everyone.  Like others on this list, I'm new to R, and really not much
of a programmer, so please excuse any obtuse questions!  I'm trying to
repeat a function across all possible combinations of vectors in a data
frame.  I'd hugely appreciate any advice!

Here's what I'm doing:

I have some data: 40 samples, ~460 000 different readings between 0 and 1
for each sample.  I would like to make R spit out a matrix of distances
between the samples.  So far, I have made a function to calculate the
distance between any two samples:

DistanceCalc <- function(x, y) {
  # x and y are both numeric vectors: the entire reading set for
  # sample x and sample y, respectively
  distance <- sqrt(sum((x - y)^2))
  # divide by sqrt(length(x)) to force the maximum possible value to 1
  distanceCorrected <- distance / sqrt(length(x))
  distanceCorrected  # return the value (print() would also echo it on every call)
}
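
As a quick sanity check of the scaling, here is a minimal sketch using two made-up vectors (a and b are illustrative only, not part of the real data). It assumes the readings lie between 0 and 1 and writes the power as ^, which is the element-wise power operator in R:

```r
# Distance between two equal-length numeric vectors, scaled so that the
# largest possible value (all readings differ by 1) is 1.
DistanceCalc <- function(x, y) {
  sqrt(sum((x - y)^2)) / sqrt(length(x))
}

# Hypothetical toy vectors at the two extremes of the 0..1 range
a <- c(0, 0, 0, 0)
b <- c(1, 1, 1, 1)

d_max  <- DistanceCalc(a, b)  # every reading differs by 1
d_self <- DistanceCalc(a, a)  # a sample compared with itself
```

With these inputs d_max should come out as 1 and d_self as 0, which matches the intent of the correction term.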

The next thing I want to do is to make this function run to compare all
possible combinations of my samples (1vs1, 1vs2, 1vs3...2vs1, 2vs2 etc).  In
Python, the only other programming language I have ever used, I would just
use a "for" loop.  I have asked the internet how to do this, but the
overwhelming response seems to be "you don't want to do it like that - use
the 'apply' functions".  I've tried to use the apply functions, but I tend
to find that I can only give my DistanceCalc function a single vector (I can
tell it where to find x, but not where to find y, or vice versa).  I've also
found the 'by' and the 'outer' functions, but I'm likewise failing at making
those work, e.g.

> distancetable<-outer(DataWithoutBlanks,DataWithoutBlanks,FUN=DistanceCalc)
Error in x - y : non-numeric argument to binary operator

I think this may be because my data has headers and the function is trying
to calculate the difference between the names of my samples, but I don't
know how to correct this.
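
For what it's worth, the error is probably not caused by the headers. outer() does not call the function once per pair of columns; it expands both arguments and calls the function a single time, and when the arguments are data frames the function most likely receives lists, which cannot be subtracted. Below is a minimal sketch of two ways to get the full matrix. It assumes each sample is one numeric column of the data frame, and it uses a small simulated data frame as a hypothetical stand-in for DataWithoutBlanks:

```r
set.seed(1)

# Hypothetical stand-in for the real data: 5 samples (columns) with
# 100 readings each, all between 0 and 1
DataWithoutBlanks <- as.data.frame(matrix(runif(500), nrow = 100))

DistanceCalc <- function(x, y) {
  sqrt(sum((x - y)^2)) / sqrt(length(x))
}

# Option 1: a plain double loop over column indices, filling a
# pre-allocated matrix. [[i]] extracts column i as a numeric vector.
n <- ncol(DataWithoutBlanks)
distancetable <- matrix(0, nrow = n, ncol = n,
                        dimnames = list(names(DataWithoutBlanks),
                                        names(DataWithoutBlanks)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    distancetable[i, j] <- DistanceCalc(DataWithoutBlanks[[i]],
                                        DataWithoutBlanks[[j]])
  }
}

# Option 2: the built-in dist(), which computes Euclidean distances
# between ROWS, so the data are transposed first; the same
# sqrt(number of readings) scaling is then applied.
distancetable2 <- as.matrix(dist(t(DataWithoutBlanks))) /
  sqrt(nrow(DataWithoutBlanks))
```

Both options should give the same symmetric matrix with zeros on the diagonal. A for loop over 40 x 40 pairs is perfectly reasonable in R; dist() is simply shorter and avoids computing each pair twice.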

Would really appreciate your help!

Jen


