[R] help for efficient loop

Takatsugu Kobayashi tkobayas at indiana.edu
Thu Mar 8 05:22:07 CET 2007


Hi,

I have been trying to reduce the computation time of the loops below.
I have successfully used lapply to speed up much simpler code, so I
would like to use lapply or sapply here as well.

The overall goal is to generate X and Y coordinates from a normal
distribution and compute local standard distances. By "local" I mean
that, for each reference point, the observation points lying within a
given distance threshold of that point are selected.

# Normally distributed X-Y Coordinates with hypothetical z values
pts<-500 # Number of observations =n
cases<-10 # Number of variables
x<-rnorm(pts)
y<-rnorm(pts)
z<-matrix(abs(rnorm(pts*cases)),pts,cases)

# Combine x, y, and zs
Ldata<-cbind(x,y,z)    # n*(2+p) matrix p=# of variables 2=X and Y

# Compute the Euclidean distances between points
disE<-data.matrix(dist(cbind(x,y)))

# Create a series of values that act as a threshold
thrsE<-seq(1,max(disE),by=0.5)

# Compute local mean centers and median centers of the nearest neighbors
# within the distance threshold of n reference points
LMNX<-matrix(,pts,length(thrsE))    # local mean X
LMNY<-matrix(,pts,length(thrsE))    # local mean Y
LMDX<-matrix(,pts,length(thrsE))    # local median X
LMDY<-matrix(,pts,length(thrsE))    # local median Y
LSDMN<-rep(list(matrix(,pts,length(thrsE))),cases)

# Then compute standard distances of the Zs of the neighbors within the
# distance thresholds of n reference points
for (j in 1:pts){
for (k in 1:length(thrsE)){
    # Neighbors of reference point j within threshold k (computed once)
    idx<-which(disE[j,]<=thrsE[k])
    LMNX[j,k]<-mean(Ldata[idx,1])
    LMNY[j,k]<-mean(Ldata[idx,2])
    LMDX[j,k]<-median(Ldata[idx,1])
    LMDY[j,k]<-median(Ldata[idx,2])
for (l in 1:cases){
    # Standard distance around the local mean center, weighted by z[,l]
    LSDMN[[l]][j,k]<-sqrt(sum(Ldata[idx,2+l]*(Ldata[idx,1]-LMNX[j,k])^2+
        Ldata[idx,2+l]*(Ldata[idx,2]-LMNY[j,k])^2)/sum(Ldata[idx,2+l]))
}}}
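
For reference, the quantity computed in the innermost loop can be written
as a small function. This is only a sketch to make the formula concrete;
the name weighted_sd_dist and its arguments are made up for illustration
and are not part of my actual code. It assumes the center is the
unweighted mean of the selected points, as in the loop above.

```r
# Hypothetical helper: standard distance of the points in `idx` around
# their unweighted mean center, with each squared distance weighted by w.
weighted_sd_dist <- function(x, y, w, idx) {
  xs <- x[idx]
  ys <- y[idx]
  ws <- w[idx]
  mx <- mean(xs)   # local mean center, X
  my <- mean(ys)   # local mean center, Y
  sqrt(sum(ws * ((xs - mx)^2 + (ys - my)^2)) / sum(ws))
}

# Small check: four corners of a square with equal weights.
# The mean center is (1, 1) and every squared distance to it is 2.
sq_x <- c(0, 2, 0, 2)
sq_y <- c(0, 0, 2, 2)
sq_w <- rep(1, 4)
sq_res <- weighted_sd_dist(sq_x, sq_y, sq_w, 1:4)
```

With equal weights this reduces to the ordinary standard distance, so the
square example should give sqrt(2).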

I believe lapply or sapply should help here, because my current approach
assigns each computed value to element [j,k] of a large matrix one at a
time. I have tried lapply, but I am not sure how to build
higher-dimensional arrays from its output.
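
To show the kind of structure I am after, here is a rough sketch on a
smaller data set. It is not tested for speed and may not be the best way
to do this. The function name one_point is made up for illustration, the
medians are left out to keep it short, and it relies on sapply with
simplify = "array" to stack the per-point results into a 3-D array.

```r
set.seed(1)
pts <- 40      # smaller than the real problem, for illustration only
cases <- 3
x <- rnorm(pts)
y <- rnorm(pts)
z <- matrix(abs(rnorm(pts * cases)), pts, cases)
disE <- as.matrix(dist(cbind(x, y)))
thrsE <- seq(1, max(disE), by = 0.5)
nthr <- length(thrsE)

# For one reference point j, return a (2 + cases) x nthr matrix:
# rows 1 and 2 hold the local mean X and Y, the remaining rows hold the
# weighted standard distance for each column of z.
one_point <- function(j) {
  vapply(thrsE, function(thr) {
    idx <- which(disE[j, ] <= thr)
    mx <- mean(x[idx])
    my <- mean(y[idx])
    d2 <- (x[idx] - mx)^2 + (y[idx] - my)^2   # squared distances to center
    w <- z[idx, , drop = FALSE]               # weights, one column per case
    c(mx, my, sqrt(colSums(w * d2) / colSums(w)))
  }, numeric(2 + cases))
}

# Stack the per-point matrices: dimensions are (2 + cases) x nthr x pts
res <- sapply(seq_len(pts), one_point, simplify = "array")

# Slice the array back into the pts x nthr layout used in the loops
LMNX <- t(res[1, , ])
LMNY <- t(res[2, , ])
LSDMN <- lapply(seq_len(cases), function(l) t(res[2 + l, , ]))
```

The idea is that each call works on one reference point and returns all of
its values at once, so nothing is assigned element by element; whether
this is actually faster than the explicit loops is something I have not
measured.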

Many thanks in advance.

Taka
Indiana University


