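Observed behavior: as the image above shows, the country names do not match their actual geometries.
Expected behavior: I would like to join a data frame to its geometry correctly and display the result in ggmap.
I have joined different data frames before, but apparently ggmap needs a "fortified" data frame (I don't actually know what that means) in order to display results.
This is what I have done so far: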
library(rgdal)
library(dplyr)
library(broom)
library(ggmap)
# Load GeoJSON file with countries.
countries = readOGR(dsn = "https://gist.githubusercontent.com/ccamara/fc26d8bb7e777488b446fbaad1e6ea63/raw/a6f69b6c3b4a75b02858e966b9d36c85982cbd32/countries.geojson")
# Load dataframe.
df = read.csv("https://gist.githubusercontent.com/ccamara/fc26d8bb7e777488b446fbaad1e6ea63/raw/a6f69b6c3b4a75b02858e966b9d36c85982cbd32/sample-dataframe.csv")
# Join geometry with dataframe.
countries$iso_a2 = as.factor(countries$iso_a2)
countries@data = left_join(countries@data, df, by = c('iso_a2' = 'country_code'))
# Convert to dataframe so it can be used by ggmap.
countries.t = tidy(countries)
# Here's where the problem starts, as by doing so, data has been lost!
# Recover attributes' table that was destroyed after using broom::tidy.
countries@data$id = rownames(countries@data) # Adding a new id variable.
countries.t = left_join(countries.t, countries@data, by = "id")
ggplot(data = countries.t,
       aes(long, lat, fill = country_name, group = group)) +
  geom_polygon() +
  geom_path(colour="black", lwd=0.05) + # polygon borders
  coord_equal() +
  ggtitle("Data and geometry have been messed!") +
  theme(axis.text = element_blank(), # change the theme options
        axis.title = element_blank(), # remove axis titles
        axis.ticks = element_blank()) # remove axis ticks
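There's a reason for the messed-up behavior.
countries starts out as a Large SpatialPolygonsDataFrame with 177 elements (and correspondingly 177 rows in countries@data). When you perform left_join on countries@data and df, the number of elements in countries isn't affected, but the number of rows in countries@data grows to 210.
Fortifying countries with broom::tidy converts countries, with its 177 elements, into a data frame with id running from 0 to 176. (I'm not sure why it's zero-indexed, but I usually prefer to specify the regions explicitly anyway.)
Adding id to countries@data based on rownames(countries@data), on the other hand, results in id values running from 1 to 210, since that's the number of rows in countries@data after the earlier join with df. As a result, everything is out of sync.
Try the following instead: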
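To make the mismatch concrete, here is a minimal sketch on toy data. The data frames attrs, df_toy and polys are hypothetical stand-ins (for countries@data, df and the tidied polygons respectively) and are not taken from the original files. It assumes, as described above, that the tidied polygons are numbered from 0 while row names after the join start at 1.

```r
library(dplyr)

# Hypothetical stand-in for countries@data: three polygons, one row each.
attrs <- data.frame(iso_a2 = c("ES", "FR", "IT"),
                    name = c("Spain", "France", "Italy"),
                    stringsAsFactors = FALSE)

# Hypothetical stand-in for df: the code "FR" appears twice.
df_toy <- data.frame(country_code = c("ES", "FR", "FR", "IT"),
                     value = c(1, 2, 3, 4),
                     stringsAsFactors = FALSE)

# The join duplicates the "FR" row, so the attribute table grows.
joined <- left_join(attrs, df_toy, by = c("iso_a2" = "country_code"))
nrow(attrs)   # 3
nrow(joined)  # 4

# Row names of the joined table run from "1" to "4".
joined$id <- rownames(joined)

# Hypothetical stand-in for the tidied polygons, with ids "0" to "2".
polys <- data.frame(id = c("0", "1", "2"),
                    true_name = c("Spain", "France", "Italy"),
                    stringsAsFactors = FALSE)

# Joining on id attaches the wrong attributes to each polygon.
mismatched <- left_join(polys, joined, by = "id")
mismatched[, c("id", "true_name", "name")]
#   id true_name   name
# 1  0     Spain   <NA>
# 2  1    France  Spain
# 3  2     Italy France
```

Every polygon picks up the attributes of a different country (or none at all), which is the same kind of shift seen in the plot.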
# (we start out right after loading countries & df)
# no need to join geometry with df first
# convert countries to data frame, specifying the regions explicitly
# (note I'm using the name column rather than the iso_a2 column from countries@data;
# this is because there are some repeat -99 values in iso_a2, and we want
# one-to-one matching.)
countries.t = tidy(countries, region = "name")
# join with the original file's data
countries.t = left_join(countries.t, countries@data, by = c("id" = "name"))
# join with df
countries.t = left_join(countries.t, df, by = c("iso_a2" = "country_code"))
# no change to the plot's code, except for ggtitle
ggplot(data = countries.t,
       aes(long, lat, fill = country_name, group = group)) +
  geom_polygon() +
  geom_path(colour="black", lwd = 0.05) +
  coord_equal() +
  ggtitle("Data and geometry are fine") +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank())
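The choice of name over iso_a2 as the region column rests on name being unique per polygon. A quick way to check a candidate key column is sketched below in base R. The helper dup_keys and the rows in attrs are hypothetical and only illustrate the idea; with the real data you would pass countries@data instead.

```r
# Hypothetical helper: return the values that occur more than once in a column.
dup_keys <- function(data, column) {
  counts <- table(data[[column]])
  names(counts)[counts > 1]
}

# Made-up rows imitating an attribute table with repeated -99 codes.
attrs <- data.frame(name = c("Country A", "Country B", "Country C"),
                    iso_a2 = c("AA", "-99", "-99"),
                    stringsAsFactors = FALSE)

dup_keys(attrs, "iso_a2")  # "-99": not safe as a one-to-one key
dup_keys(attrs, "name")    # character(0): every value is unique
```

A column that returns an empty result here can be used as the region in tidy() and as the join key without multiplying rows.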
P.S. You don't actually need the ggmap package, only the ggplot2 package that it loads.