r - Closest point to a path -

- September 15, 2012

i have 2 sets of points, called path , centers. each point in path, efficient method finding id of closest point in centers. in r. below simple reproducible example.

set.seed(1) n <- 10000 x <- 100*cumprod(1 + rnorm(n, 0.0001, 0.002)) y <- 50*cumprod(1 + rnorm(n, 0.0001, 0.002))  path <- data.frame(cbind(x=x, y=y))  centers <- expand.grid(x=seq(0, 500,by=0.5) + rnorm(1001),                         y=seq(0, 500, by=0.2) + rnorm(2501))  centers$id <- seq(nrow(centers))

x , y coordinates. add column path data.frame has id of closest center given x , y co-ordinate. want of unique ids.

my solution @ moment work, slow when scale of problem increases. more efficient.

path$closest.id <- sapply(seq(nrow(path)), function(z){    tmp <- ((centers$x - path[z, 'x'])^2) + ((centers$y - path[z, 'y'])^2)    as.numeric(centers[tmp == min(tmp), 'id']) })  output <- unique(path$closest.id)

any on speeding appreciated.

i think data.table might help, ideally looking algorithm perhaps bit smarter in terms of search, i.e. instead of calculating distances each center , selecting minimum one... id...

i happy use rcpp/rcpp11 if improve performance.

my minimum acceptable time perform kind of calculation out 10 seconds, faster better.

you can nn2 rann package. on system, computes nearest center each of path points in under 2 seconds.

library(rann) system.time(closest <- nn2(centers[, 1:2], path, 1))  #   user  system elapsed  #   1.41    0.14    1.55     sapply(closest, head)  #      nn.idx   nn.dists # [1,] 247451 0.20334929 # [2,] 250454 0.12326323 # [3,] 250454 0.28540127 # [4,] 253457 0.05178687 # [5,] 253457 0.13324137 # [6,] 253457 0.09009626

here's example 2.5 million candidate points fall within extent of path points (in example, centers have larger x , y range path points). it's little slower in case.

set.seed(1) centers2 <- cbind(runif(2.5e6, min(x), max(x)), runif(2.5e6, min(y), max(y))) system.time(closest2 <- nn2(centers2, path, 1))  #   user  system elapsed  #   2.96    0.11    3.07   sapply(closest2, head)  #       nn.idx    nn.dists # [1,]  730127 0.025803703 # [2,]  375514 0.025999069 # [3,] 2443707 0.047259283 # [4,]   62780 0.022747930 # [5,] 1431847 0.002482623 # [6,] 2199405 0.028815865

this can compared output using sp::spdistsn1 (which slower problem):

library(sp) apply(head(path), 1, function(x) which.min(spdistsn1(centers, x)))  #       1       2       3       4       5       6  #  730127  375514 2443707   62780 1431847 2199405

adding point id path data.frame , reducing unique values trivial:

path$closest.id <- closest$nn.idx output <- unique(path$closest.id)

Search This Blog

Crty

r - Closest point to a path -

Comments

Post a Comment

Popular posts from this blog

c# - MSAA finds controls UI Automation doesn't -

python - mat is not a numerical tuple : openCV error -

wordpress - .htaccess: RewriteRule: bad flag delimiters -