我在 R 中有以下数据集 - 该数据集包含地理编码地址(加拿大)及其邮政编码、经度和纬度的列表:https ://www.dropbox.com/scl/fi/9kjoqsppb85ip0tdc5wmr/stackoverflow_example.csv?rlkey= cwjk222jnoz8c9cgbt01p6bep&dl=0
我将这些数据加载到 R 中:
library(httr)
library(data.table)
url <- "https://www.dropbox.com/scl/fi/9kjoqsppb85ip0tdc5wmr/stackoverflow_example.csv?rlkey=cwjk222jnoz8c9cgbt01p6bep&dl=1"
response <- GET(url)
df <- fread(content(response, "text"), sep = ",", quote = "", fill = TRUE)
> head(df)
"" "latitude" "longitude" "source_id" "id" "group_id" "street_no" "street" "str_name" "str_type" "str_dir" "unit" "city" "postal_code"
1: "999976" 44.25845 -76.46308 "" "6baaa1692aaaa7b496aa" 2494328 "104" "POINT ST. MARK DR" "POINT ST. MARK" "DR" "" "" "KINGSTON" "K7K 6X8"
2: "999977" 44.26391 -76.45090 "" "1f01e7839e59727d95a3" 2508891 "229" "GREENLEES DR" "GREENLEES" "DR" "" "" "KINGSTON" "K7K 6R1"
3: "999978" 44.26262 -76.45164 "" "289d509f161955a7b5ad" 2519600 "443" "MAUREEN ST" "MAUREEN" "ST" "" "" "KINGSTON" "K7K 7M2"
4: "999979" 44.26545 -76.45230 "" "e6516f9c0e173b684627" 2506548 "198" "GREENLEES DR" "GREENLEES" "DR" "" "" "KINGSTON" "K7K 6P7"
5: "999983" 44.23525 -76.49264 "" "2576c9ff82b86638737c" 2508273 "22" "ELLICE ST" "ELLICE" "ST" "" "" "KINGSTON" "K7K 1M5"
6: "999993" 44.26520 -76.45182 "" "03aaecc65f29e1e04fe1" 2507798 "212" "GREENLEES DR" "GREENLEES" "DR" "" "" "KINGSTON" "K7K 6P7"
"full_addr" "city_pcs" "str_name_pcs" "str_type_pcs" "str_dir_pcs" "csduid" "csdname" "pruid" "provider"
1: "104 POINT ST. MARK DR" "KINGSTON" "PT ST MARK" "DR" "" 3510010 "Kingston" 35 "City of Kingston"
2: "229 GREENLEES DR" "KINGSTON" "GREENLEES" "DR" "" 3510010 "Kingston" 35 "City of Kingston"
3: "443 MAUREEN ST" "KINGSTON" "MAUREEN" "ST" "" 3510010 "Kingston" 35 "City of Kingston"
4: "198 GREENLEES DR" "KINGSTON" "GREENLEES" "DR" "" 3510010 "Kingston" 35 "City of Kingston"
5: "22 ELLICE ST" "KINGSTON" "ELLICE" "ST" "" 3510010 "Kingston" 35 "City of Kingston"
6: "212 GREENLEES DR" "KINGSTON" "GREENLEES" "DR" "" 3510010 "Kingston" 35 "City of Kingston"
Run Code Online (Sandbox Code Playgroud)
我的问题:使用这些数据,我试图“近似”每个邮政编码的地理边界并将结果可视化。
基于我在这里发布的类似问题(R:制作加拿大邮政编码地图),有人向我建议一种称为“多边形裁剪”的技术可能有助于完成此任务(即参见@User提供的答案:Michael)杜瓦瓶)。
我正在尝试了解“多边形裁剪”如何有助于完成此任务 - 但我不确定如何开始使用此任务。最后,我想要一个类似于 shapefile 的东西,其中包含每个邮政编码的“周长”(以经度和纬度坐标表示)。
有人可以告诉我该怎么做吗?
谢谢!
笔记:
I_O*_*_O 5
沃罗诺伊近似:
(地图图块由 Stamen Design 提供,CC BY 3.0 下。数据由 OpenStreetMap 提供,根据 ODbL。)
从您的 httr 响应开始,方法如下:
d(在代码中省略 quote 参数,否则你会得到有趣的列名称): d <- data.table::fread(content(response, "text"), sep = ",", fill = TRUE)
Run Code Online (Sandbox Code Playgroud)
my_proj <- '+proj=aea +lat_1=50 +lat_2=70 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs'
Run Code Online (Sandbox Code Playgroud)
d: library(sf)
library(dplyr)
the_points <-
d |>
select(longitude, latitude, postal_code) |>
st_as_sf(coords = c('longitude', 'latitude'),
crs = 4326 ## EPSG-code for lon/lat coords
) |>
st_transform(my_proj) |> ## transf. to "Canadian" proj.
head(100) ## only the first 100 items for demonstration
Run Code Online (Sandbox Code Playgroud)
the_zones <-
the_points |>
st_union() |> ## !
st_voronoi() |>
st_collection_extract('POLYGON') |>
st_as_sf() |>
## create an ID for later join with postal code:
mutate(id = paste0('zone_', row_number())) |>
rename(geometry = x)
Run Code Online (Sandbox Code Playgroud)
st_intersection使用和获取哪个邮政编码(点数据)属于哪个区域(多边形数据)的查找向量pull: postal_code_lookup <-
st_intersection(the_zones, the_points) |> ## keep only zones around points
distinct() |>
as.data.frame() |>
pull(postal_code, id) ## pulling two cols creates a named vector
the_zones |>
mutate(postal_code = postal_code_lookup[id])
Run Code Online (Sandbox Code Playgroud)
生成的空间数据帧(用于进一步处理):
+ Simple feature collection with 74 features and 2 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 1554420 ymin: 683010.2 xmax: 1569265 ymax: 698094.6
CRS: +proj=aea +lat_1=50 +lat_2=70 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs
First 10 features:
id geometry postal_code
1 zone_1 POLYGON ((1554420 697558.1,... K7K 0C3
2 zone_2 POLYGON ((1560077 688369.5,... K7K 1X7
3 zone_3 POLYGON ((1560089 688314.7,... K7K 4S4
4 zone_4 POLYGON ((1554420 688610.3,... K7K 4W8
## ...
Run Code Online (Sandbox Code Playgroud)
library(gglot2)
library(ggmap)
the_zones <-
the_zones |>
mutate(postal_code = postal_code_lookup[id]) |>
st_transform(4326) ## back to geographic coords
the_background <- get_stamenmap(c(left = -76.46, bottom = 44.25,
right = -76.42, top = 44.27),
zoom = 14
)
ggmap(the_background) +
geom_sf(data = the_zones, alpha = 0, inherit.aes = FALSE) +
geom_sf_text(data = the_zones, aes(label = postal_code), inherit.aes = FALSE) +
geom_sf(data = st_transform(the_points, 4326), inherit.aes = FALSE) +
coord_sf(xlim = c(-76.46, -76.42), ylim = c(44.255, 44.27)) +
theme_minimal()
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
212 次 |
| 最近记录: |