Method for calculating the distances between all points in a data frame containing a list of xy coordinates

unk*_*own 4 math for-loop r apply

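I'm sure this has been answered before, but I can't for the life of me find the thread!

I'm trying to use R to generate a list of all the distances between pairs of xy coordinates in a data frame. The data is stored like this: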

ID = c('1','2','3','4','5','6','7')
x = c(1,2,4,5,1,3,1)
y = c(3,5,6,3,1,5,1)
df= data.frame(ID,x,y)

Currently I can calculate the distance between two points using:

length = sqrt((x1 - x2)^2 + (y1 - y2)^2)

However, I'm not sure where to go next. Should I use something from plyr, or a for loop?

Thanks for your help!

小智 13

Have you tried dist? The formula you listed is the Euclidean distance, which is what dist computes by default.

dist(df[,-1]) 
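dist returns a compact lower-triangle object rather than a list of pairs. A minimal sketch of one way to reshape it into the list the question asks for, assuming the df from the question; the names m, idx and pairs are only illustrative:

```r
# Data from the question
ID <- c('1', '2', '3', '4', '5', '6', '7')
x <- c(1, 2, 4, 5, 1, 3, 1)
y <- c(3, 5, 6, 3, 1, 5, 1)
df <- data.frame(ID, x, y)

# Euclidean distances between all rows, using only the coordinate columns
d <- dist(df[, c("x", "y")])

# Expand to a full symmetric 7 x 7 matrix
m <- as.matrix(d)

# Row/column positions of the lower triangle: each unordered pair once
idx <- which(lower.tri(m), arr.ind = TRUE)

# One row per pair of points, labelled with the original IDs
pairs <- data.frame(
  ID1  = df$ID[idx[, "col"]],
  ID2  = df$ID[idx[, "row"]],
  dist = m[idx]
)
```

With 7 points this gives choose(7, 2) = 21 rows. Points 5 and 7 share the same coordinates, so their distance is 0.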


Jim*_*ach 6

You can use a self-join to get all combinations, then apply your distance formula. All of this is easy to do with the tidyverse (a collection of packages from Hadley Wickham):

# Load the tidyverse
library(tidyverse)

# Set up a fake key to join on (just a constant)
df <- df %>% mutate(k = 1) 

# Perform the join, remove the key, then create the distance
df %>% 
 full_join(df, by = "k") %>% 
 mutate(dist = sqrt((x.x - x.y)^2 + (y.x - y.y)^2)) %>%
 select(-k)

N.B. using this method, you'll also calculate the distance between each point and itself (as well as with all other points). It's easy to filter those points out though:

df %>% 
 full_join(df, by = "k") %>% 
 filter(ID.x != ID.y) %>%
 mutate(dist = sqrt((x.x - x.y)^2 + (y.x - y.y)^2)) %>%
 select(-k)
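If you would rather not load the tidyverse, the same self-join idea can be sketched in base R. This assumes the df from the question and relies on merge with by = NULL, which returns the Cartesian product of the two data frames; all_pairs and no_self are illustrative names:

```r
# Data from the question
ID <- c('1', '2', '3', '4', '5', '6', '7')
x <- c(1, 2, 4, 5, 1, 3, 1)
y <- c(3, 5, 6, 3, 1, 5, 1)
df <- data.frame(ID, x, y)

# Cross join of df with itself; shared column names get .x / .y suffixes
all_pairs <- merge(df, df, by = NULL)

# Apply the Euclidean distance formula to every combination
all_pairs$dist <- sqrt((all_pairs$x.x - all_pairs$x.y)^2 +
                       (all_pairs$y.x - all_pairs$y.y)^2)

# Drop the rows that pair a point with itself
no_self <- all_pairs[all_pairs$ID.x != all_pairs$ID.y, ]
```

For 7 points the cross join has 49 rows, and 42 remain after removing the self-pairs. Each remaining pair still appears twice (1 to 2 and 2 to 1).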

For more information about using the tidyverse set of packages I'd recommend R for Data Science or the tidyverse website.

  • For anyone looking at this in the future: if you need to remove duplicates (e.g. 1 to 2 and 2 to 1 give the same distance), here is a (very ugly, but fully functional) piece of code you can use: 'IDx = df$ID.x; IDy = df$ID.y; length = df$length; df <- data.frame(IDx, IDy, length); df <- data.frame(t(apply(df, 1, sort))); df <- unique(df); IDx = df$X2; IDy = df$X3; length = as.numeric(paste(df$X1)); df1 <- data.frame(IDx, IDy, length)' (3 upvotes)
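A possibly tidier way to get each pair only once is to build the unordered pairs directly with combn, so there is nothing to de-duplicate afterwards. This is a sketch assuming the original df from the question (before any join); idx, i, j and unique_pairs are illustrative names:

```r
# Data from the question
ID <- c('1', '2', '3', '4', '5', '6', '7')
x <- c(1, 2, 4, 5, 1, 3, 1)
y <- c(3, 5, 6, 3, 1, 5, 1)
df <- data.frame(ID, x, y)

# All unordered pairs of row positions: a 2 x choose(n, 2) matrix
idx <- combn(nrow(df), 2)
i <- idx[1, ]
j <- idx[2, ]

# One row per unordered pair, with the Euclidean distance between the points
unique_pairs <- data.frame(
  IDx    = df$ID[i],
  IDy    = df$ID[j],
  length = sqrt((df$x[i] - df$x[j])^2 + (df$y[i] - df$y[j])^2)
)
```

Working with row positions rather than comparing the ID values avoids relying on how the IDs happen to sort as strings.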