[R] Nearest Neighbor Algorithm in R -- again.
Prof Brian Ripley
ripley at stats.ox.ac.uk
Mon Feb 2 23:29:38 CET 2004
Why not modify the C code of knn? (Not knn1)
On Mon, 2 Feb 2004, Hans W Borchers wrote:
> Several of the methods I use for analyzing large data sets, such as
>
> WinGamma: determining the level of noise in data
> Relief-F: estimating the influence of variables
>
> depend on finding the k nearest neighbors of a point in a data frame or
> matrix efficiently. (For large data sets it is not feasible to compute
> the 'dist' matrix anyway.)
>
> Seeing the proposed solution to "[R] distance between two matrices"
> last month for finding _one_ nearest neighbor I came up with a solution
> 'nearest(A, n, k)' as appended.
>
> Still, this is very clumsy and slow if you have to find the 3 nearest
> neighbors for 1000 points in a data frame with 100000 entries at least
> -- about 2 secs per data point on my computer or half an hour for an
> application from real life.
>
> Are there better/faster ways to perform this task using R functions?
> Even better, is there a free implementation of kd-trees that I could
> utilize (the one I found did not conform to the ANSI C standard)?
>
> Someome pointed to 'spdep' of the R-Sig-Geo project, but 'knearneigh'
> only accepts 2D data points.
>
> Hans Werner Borchers
> ABB Corporate Research, Germany
> ________________________________________________________________________
>
> require (class)
> nearest <- function (X, n, k=3)
> ## Find k nearest neighbors of X[n, ] in the data frame
> ## or matrix X, utilizing function knn1 k-times.
> {
> N <- nrow(X)
> # inds contains the indices of nearest neighbors
> inds <- c(n); i <- 0
> while (i < k) {
> # use knn1 for one index...
> j <- as.integer(knn1(X [-inds, ], X[n, ], 1:(N-length(inds))))
> # ...and change to true index of neighbor
> inds <- c(inds, setdiff(1:N, inds)[j])
> i <- i+1
> }
> # return nearest neighbor indices (without n, of course)
> return(inds[-1])
> }
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list