kle*_*52s 2 python geocoding shapefile coordinates
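I want to generate a set of random latitude and longitude coordinates within the United States (including Hawaii and Alaska). I tried using the National Weather Service shapefile (https://www.weather.gov/gis/USstates), but it generates points in the middle of the ocean. What is the best way to do this? I considered defining my own polygon over the US interior, but that would exclude some states. I've also seen other similar questions that use a CSV list of US cities, but I'd rather it be completely random.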
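This requires geopandas, but it is a quick and standard way to sample inside an odd shape (a form of Monte Carlo sampling, often called rejection sampling): draw points uniformly in the shape's bounding box and keep only those that fall inside the shape. Most of the comments below outline the same concept.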
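To make the idea concrete without any geospatial dependency, here is a minimal sketch of the same draw-then-filter loop on a toy polygon. It is not the geopandas code from this answer: the helper names (`point_in_polygon`, `sample_in_polygon`) and the L-shaped polygon are made up for illustration, and the containment test is a plain ray-casting check standing in for `within`.

```python
import random


def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the polygon `vertices`."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_in_polygon(vertices, n, seed=None):
    """Draw uniformly in the bounding box; keep points inside the polygon."""
    rng = random.Random(seed)
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    points = []
    while len(points) < n:  # keep drawing until n points were accepted
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        if point_in_polygon(x, y, vertices):
            points.append((x, y))
    return points


# Hypothetical L-shaped polygon (illustration only, not real geography)
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
points = sample_in_polygon(l_shape, n=1000, seed=2)
```

Unlike the answer's code, which draws a fixed number of candidates and keeps whatever lands inside, this sketch loops until exactly `n` points have been accepted.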
# grab shape within which to sample
url = "https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_nation_20m.zip"
us = gpd.read_file(url).explode()
## filter out parts of the US that are far away from mainland, I have no idea what they are (Guam islands?)
us = us.loc[us.geometry.apply(lambda x: x.exterior.bounds[2])<-60]
# merge the remaining parts once, then grab the bounding box within which to generate random numbers
us_shape = us.geometry.unary_union
x_min,y_min,x_max,y_max = us_shape.bounds
# the sampling
np.random.seed(2) # set seed (needed for reproducible results)
N = 10000
rndn_sample = pd.DataFrame({'x':np.random.uniform(x_min,x_max,N),'y':np.random.uniform(y_min,y_max,N)}) # actual generation
# re-save results in a geodataframe
rndn_sample = gpd.GeoDataFrame(rndn_sample, geometry = gpd.points_from_xy(x=rndn_sample.x, y=rndn_sample.y),crs = us.crs)
# filtering
inUS = rndn_sample.geometry.within(us_shape) # check if within the U.S. bounds (vectorized, reuses the merged shape)
rndn_sample.loc[inUS,:].plot() # plot for visual inspection of results
# grab shapefile of the US from an official source
url = "https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_nation_20m.zip"
us = gpd.read_file(url).explode()
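Note that with explode() I expand the multipart polygon into separate rows. This makes it easier to filter down to the regions of interest, because we can get the bounds of each part of the multipart polygon, as shown below. bounds is (minx, miny, maxx, maxy), so bounds[2] is each part's easternmost longitude. Note that -60 is only an approximate longitude for the easternmost part of the US near the mainland (Puerto Rico). Feel free to lower it to exclude PR.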
## filter out parts of the US that are far away from mainland, I have no idea what they are (Guam islands?)
us = us.loc[us.geometry.apply(lambda x: x.exterior.bounds[2])<-60]
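To see what the bounds[2] filter does without downloading anything, here is a small sketch using shapely (which geopandas depends on). The three rectangles are made-up stand-ins for parts of the exploded multipolygon, not real boundaries, and the names `parts`, `kept` and `merged` are only for illustration.

```python
from shapely.geometry import MultiPolygon, box
from shapely.ops import unary_union

# Made-up rectangles (minx, miny, maxx, maxy) standing in for real parts
mainland_like = box(-125, 25, -67, 49)
alaska_like = box(-170, 52, -130, 71)
guam_like = box(144, 13, 145, 14)  # positive longitude, far from the mainland

us_like = MultiPolygon([mainland_like, alaska_like, guam_like])

# Equivalent of explode(): iterate over the individual polygons
parts = list(us_like.geoms)

# bounds is (minx, miny, maxx, maxy); keep parts whose eastern edge is west of -60
kept = [p for p in parts if p.exterior.bounds[2] < -60]

# Merge the kept parts back together and read off the sampling box
merged = unary_union(kept)
x_min, y_min, x_max, y_max = merged.bounds
```

Here the Guam-like rectangle is dropped because its maximum longitude is positive, and the bounding box covers only the two remaining parts.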
# grab bounding box within which to generate random numbers
x_min,y_min,x_max,y_max = us.geometry.unary_union.bounds # save min and max x/y coords
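Note that unary_union merges the individual rows back into a single multipart polygon, and bounds returns the min/max x and y coordinates of the filtered subset of the US (i.e. without Guam). On geopandas 1.0 and later, unary_union is deprecated in favor of union_all().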
np.random.seed(2) # set seed (needed for reproducible results)
N = 10000
rndn_sample = pd.DataFrame({'x':np.random.uniform(x_min,x_max,N),'y':np.random.uniform(y_min,y_max,N)}) # actual generation
# re-save results in a geodataframe
rndn_sample = gpd.GeoDataFrame(rndn_sample, geometry = gpd.points_from_xy(x=rndn_sample.x, y=rndn_sample.y),crs = us.crs)
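One caveat, in case "completely random" should mean uniform by land area rather than uniform in degrees: drawing latitude uniformly in degrees puts relatively more points per square kilometre at high latitudes (Alaska) than at low ones. A rough sketch of one way to compensate, assuming the coordinates are longitude/latitude in degrees and treating the Earth as a sphere, is to draw sin(latitude) uniformly instead. The function name and the bounding box values below are hypothetical, and this is an optional variation, not part of the answer's method.

```python
import numpy as np


def sample_lonlat_equal_area(x_min, y_min, x_max, y_max, n, rng):
    """Draw n lon/lat points whose density is uniform by area on a sphere."""
    lon = rng.uniform(x_min, x_max, n)
    # Area on a sphere is proportional to cos(lat) dlat dlon, so sin(lat)
    # must be uniform rather than lat itself.
    s = rng.uniform(np.sin(np.radians(y_min)), np.sin(np.radians(y_max)), n)
    lat = np.degrees(np.arcsin(s))
    return lon, lat


rng = np.random.default_rng(2)
# Hypothetical bounding box, roughly Hawaii up to northern Alaska
lon, lat = sample_lonlat_equal_area(-180.0, 18.0, -65.0, 72.0, 10_000, rng)
```

The resulting `lon` and `lat` arrays could replace the two `np.random.uniform` calls before the points are filtered against the US shape; the filtering step itself is unchanged.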
us_shape = us.geometry.unary_union # merge the parts once instead of once per point
inUS = rndn_sample.geometry.within(us_shape) # check if within the U.S. bounds
rndn_sample.loc[inUS,:].plot() # plot for visual inspection of results
By the way, here are the required libraries, in case that isn't clear:
# load libraries
import pandas as pd
import geopandas as gpd
import numpy as np