cin*_*n21 5 python pandas scikit-learn geopandas
In this question, I refer to this project:

https://automating-gis-processes.github.io/site/master/notebooks/L3/nearest-neighbor-faster.html

We have two GeoDataFrames:

Buildings:

             name                   geometry
0            None  POINT (24.85584 60.20727)
1     Uimastadion  POINT (24.93045 60.18882)
2            None  POINT (24.95113 60.16994)
3  Hartwall Arena  POINT (24.92918 60.20570)

And bus stops:

     stop_name   stop_lat   stop_lon  stop_id                   geometry
0  Ritarihuone  60.169460  24.956670  1010102  POINT (24.95667 60.16946)
1   Kirkkokatu  60.171270  24.956570  1010103  POINT (24.95657 60.17127)
2   Kirkkokatu  60.170293  24.956721  1010104  POINT (24.95672 60.17029)
3    Vironkatu  60.172580  24.956554  1010105  POINT (24.95655 60.17258)

After applying BallTree from sklearn.neighbors:

from sklearn.neighbors import BallTree
import numpy as np

def get_nearest(src_points, candidates, k_neighbors=1):
    """Find nearest neighbors for all source points from a set of candidate points"""

    # Create tree from the candidate points
    tree = BallTree(candidates, leaf_size=15, metric='haversine')

    # Find closest points and distances
    distances, indices = tree.query(src_points, k=k_neighbors)

    # Transpose to get distances and indices into arrays
    distances = distances.transpose()
    indices = indices.transpose()

    # Get closest indices and distances (i.e. array at index 0)
    # note: for the second closest points, you would take index 1, etc.
    closest = indices[0]
    closest_dist = distances[0]

    # Return indices and distances
    return (closest, closest_dist)


def nearest_neighbor(left_gdf, right_gdf, return_dist=False):
    """
    For each point in left_gdf, find closest point in right GeoDataFrame and return them.

    NOTICE: Assumes that the input Points are in WGS84 projection (lat/lon).
    """

    left_geom_col = left_gdf.geometry.name
    right_geom_col = right_gdf.geometry.name

    # Ensure that index in right gdf is formed of sequential numbers
    right = right_gdf.copy().reset_index(drop=True)

    # Parse coordinates from points and insert them into a numpy array as RADIANS
    left_radians = np.array(left_gdf[left_geom_col].apply(lambda geom: (geom.x * np.pi / 180, geom.y * np.pi / 180)).to_list())
    right_radians = np.array(right[right_geom_col].apply(lambda geom: (geom.x * np.pi / 180, geom.y * np.pi / 180)).to_list())

    # Find the nearest points
    # -----------------------
    # closest ==> index in right_gdf that corresponds to the closest point
    # dist ==> distance between the nearest neighbors (in meters)

    closest, dist = get_nearest(src_points=left_radians, candidates=right_radians)

    # Return points from right GeoDataFrame that are closest to points in left GeoDataFrame
    closest_points = right.loc[closest]

    # Ensure that the index corresponds to the one in left_gdf
    closest_points = closest_points.reset_index(drop=True)

    # Add distance if requested
    if return_dist:
        # Convert to meters from radians
        earth_radius = 6371000  # meters
        closest_points['distance'] = dist * earth_radius

    return closest_points


closest_stops = nearest_neighbor(buildings, stops, return_dist=True)

For each building index, we get the distance to the nearest bus stop:

      stop_name   stop_lat   stop_lon  stop_id                   geometry    distance
0    Muusantori  60.207490  24.857450  1304138  POINT (24.85745 60.20749)  180.521584
1    Eläintarha  60.192490  24.930840  1171120  POINT (24.93084 60.19249)  372.665221
2  Senaatintori  60.169010  24.950460  1020450  POINT (24.95046 60.16901)  119.425777
3     Veturitie  60.206610  24.929680  1174112  POINT (24.92968 60.20661)  106.762619

I am looking for a solution that returns, for each building, every bus stop (there can be more than one) within 250 meters.

Thanks for your help.

Here is a way to reuse what is done with BallTree in the question, but with query_radius as discussed. It is not written as a function, but you can easily change that.

from sklearn.neighbors import BallTree
import numpy as np
import pandas as pd

# here I start with buildings and stops as loaded in the link provided

# radius in meters, change as needed
radius_max = 250  # meters
# another parameter, in case you want to do this with the radius of Mars ^^
earth_radius = 6371000  # meters

# similar to the method with apply in the tutorial
# to create left_radians and right_radians, but faster
# note: columns are (x, y) = (lon, lat), as in the tutorial, whereas
# scikit-learn documents the haversine metric as expecting (lat, lon)
candidates = np.vstack([stops['geometry'].x.to_numpy(),
                        stops['geometry'].y.to_numpy()]).T * np.pi / 180
src_points = np.vstack([buildings['geometry'].x.to_numpy(),
                        buildings['geometry'].y.to_numpy()]).T * np.pi / 180

# Create tree from the candidate points
tree = BallTree(candidates, leaf_size=15, metric='haversine')

# use query_radius instead of query
ind_radius, dist_radius = tree.query_radius(src_points,
                                            r=radius_max / earth_radius,
                                            return_distance=True)

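A side note on coordinate order, as a minimal sketch and not part of the answer itself: scikit-learn documents its haversine metric as taking (latitude, longitude) in radians, while the arrays above are stacked as (x, y), that is (longitude, latitude). The standard-library check below uses a hypothetical helper, `haversine_m`, on the first building and the Muusantori stop, with coordinates copied from the question's listings.

```python
import math

EARTH_RADIUS_M = 6371000  # same value as earth_radius in the answer


def haversine_m(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_M):
    """Great-circle distance in meters; all inputs in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


# First building and the Muusantori stop, as printed in the question
b_lon, b_lat = 24.85584, 60.20727
s_lon, s_lat = 24.85745, 60.20749

# (lat, lon) order, which is what the haversine formula assumes
lat_lon_order = haversine_m(b_lat, b_lon, s_lat, s_lon)
# (lon, lat) order, mimicking arrays built as (x, y)
lon_lat_order = haversine_m(b_lon, b_lat, s_lon, s_lat)

print(round(lat_lon_order, 1), round(lon_lat_order, 1))
```

If the first value comes out near 92 m and the second near 180 m (the figure in the question's output), the printed distances reflect the swapped order. Stacking `.y` before `.x` when building `candidates` and `src_points` should then give true great-circle distances, and the set of stops within 250 m may change.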
Now you can manipulate the result to get what you want:

# create a dataframe with
# index based on row position of the building in buildings
# column row_stop is the row position of the stop
# dist is the distance in meters
closest_dist = pd.concat([pd.Series(ind_radius).explode().rename('row_stop'),
                          pd.Series(dist_radius).explode().rename('dist') * earth_radius],
                         axis=1)

print(closest_dist.head())
#   row_stop     dist
# 0     1131  180.522
# 1      NaN      NaN
# 2       64  174.744
# 2       61  119.426
# 3      532  106.763

# merge the dataframe created above with the original data stops
# to get names, id, ... note: the index of stops must be reset,
# as row_stop in closest_dist is position based
closest_stop = closest_dist.merge(stops.reset_index(drop=True),
                                  left_on='row_stop', right_index=True, how='left')

print(closest_stop.head())
#   row_stop     dist     stop_name  stop_lat  stop_lon    stop_id  \
# 0     1131  180.522    Muusantori  60.20749  24.85745  1304138.0
# 1      NaN      NaN           NaN       NaN       NaN        NaN
# 2       64  174.744  Senaatintori  60.16896  24.94983  1020455.0
# 2       61  119.426  Senaatintori  60.16901  24.95046  1020450.0
# 3      532  106.763     Veturitie  60.20661  24.92968  1174112.0
#
#                     geometry
# 0  POINT (24.85745 60.20749)
# 1                       None
# 2  POINT (24.94983 60.16896)
# 2  POINT (24.95046 60.16901)
# 3  POINT (24.92968 60.20661)

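To see what the `explode` step does on its own, here is a small pandas-only sketch. The two Series are toy stand-ins for `ind_radius` and `dist_radius` (values copied from the printed rows above, with distances already in meters), and `with_stops` and `n_stops` are illustrative names that do not appear in the answer.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the query_radius output: one array per building.
# The second building has no stop in range, hence the empty arrays.
ind_radius = pd.Series([np.array([1131]),
                        np.array([], dtype=int),
                        np.array([64, 61])], dtype=object)
dist_radius = pd.Series([np.array([180.522]),
                         np.array([]),
                         np.array([174.744, 119.426])], dtype=object)

# explode() gives one row per array element and repeats the building's
# position in the index; an empty array becomes a single NaN row
closest_dist = pd.concat([ind_radius.explode().rename('row_stop'),
                          dist_radius.explode().rename('dist')],
                         axis=1)
print(closest_dist)

# Keep only buildings that have at least one stop in range
with_stops = closest_dist.dropna(subset=['row_stop'])

# Number of stops in range per building position
n_stops = with_stops.groupby(level=0).size()
print(n_stops)
```

Because the index repeats the building position, a later `join` against `buildings.reset_index(drop=True)` duplicates the building row once per matching stop.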
Finally, join back to buildings:

# join buildings (with reset_index) with closest_stop,
# as the index in closest_stop is position based
final_df = buildings.reset_index(drop=True).join(closest_stop, rsuffix='_stop')

print(final_df.head(10))
#              name                   geometry row_stop     dist     stop_name  \
# 0            None  POINT (24.85584 60.20727)     1131  180.522    Muusantori
# 1     Uimastadion  POINT (24.93045 60.18882)      NaN      NaN           NaN
# 2            None  POINT (24.95113 60.16994)       64  174.744  Senaatintori
# 2            None  POINT (24.95113 60.16994)       61  119.426  Senaatintori
# 3  Hartwall Arena  POINT (24.92918 60.20570)      532  106.763     Veturitie
#
#    stop_lat  stop_lon    stop_id              geometry_stop
# 0  60.20749  24.85745  1304138.0  POINT (24.85745 60.20749)
# 1       NaN       NaN        NaN                       None
# 2  60.16896  24.94983  1020455.0  POINT (24.94983 60.16896)
# 2  60.16901  24.95046  1020450.0  POINT (24.95046 60.16901)
# 3  60.20661  24.92968  1174112.0  POINT (24.92968 60.20661)

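Since the code above is not in function form, here is one way it might be wrapped, as a sketch under assumptions. `stops_within_radius` and its column names (`row_src`, `row_cand`, `dist_m`) are hypothetical. It takes plain (lon, lat) arrays instead of GeoDataFrames, flips them to (lat, lon) because that is the order scikit-learn documents for haversine (so distances can differ from those printed above), and returns only matched pairs, so a building with no stop in range produces no row instead of a NaN row.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import BallTree


def stops_within_radius(src_lonlat, cand_lonlat, radius_m=250, earth_radius=6371000):
    """Hypothetical helper: list every candidate within radius_m of each source.

    src_lonlat, cand_lonlat: array-likes of shape (n, 2) with (lon, lat) in degrees.
    Assumes at least one source point. Returns one row per (source, candidate) pair.
    """
    # Flip columns to (lat, lon) and convert degrees to radians
    src = np.radians(np.asarray(src_lonlat, dtype=float)[:, ::-1])
    cand = np.radians(np.asarray(cand_lonlat, dtype=float)[:, ::-1])

    tree = BallTree(cand, leaf_size=15, metric='haversine')
    # query_radius returns (indices, distances) when return_distance=True
    ind, dist = tree.query_radius(src, r=radius_m / earth_radius,
                                  return_distance=True)

    counts = [len(i) for i in ind]
    return pd.DataFrame({
        'row_src': np.repeat(np.arange(len(src)), counts),
        'row_cand': np.concatenate(list(ind)),
        'dist_m': np.concatenate(list(dist)) * earth_radius,
    })


# Coordinates copied from the listings above, as (lon, lat)
buildings_xy = [(24.85584, 60.20727), (24.93045, 60.18882),
                (24.95113, 60.16994), (24.92918, 60.20570)]
stops_xy = [(24.85745, 60.20749),   # Muusantori
            (24.93084, 60.19249),   # Eläintarha
            (24.95046, 60.16901),   # Senaatintori (1020450)
            (24.94983, 60.16896),   # Senaatintori (1020455)
            (24.92968, 60.20661)]   # Veturitie

pairs = stops_within_radius(buildings_xy, stops_xy, radius_m=250)
print(pairs.sort_values(['row_src', 'dist_m']))
```

With GeoDataFrames, the inputs could be built with `np.column_stack([gdf.geometry.x, gdf.geometry.y])`, and `row_src` and `row_cand` can then be used with `.iloc` on the original frames to recover names and ids.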