How to correctly join data and geometry with ggmap

cca*_*ara 6 r ggmap

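A picture is worth a thousand words: [image: data and geometry do not match]

Observed behavior: as the image above shows, the country names do not match their actual geometries.

Expected behavior: I want to join the data frame with its geometry correctly and display the result in ggmap.

I have joined data frames before, but ggmap apparently needs the data frame to be "fortified" (I don't actually know what that means) before it can display the result.

This is what I have done so far: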

library(rgdal)
library(dplyr)
library(broom)
library(ggmap)

# Load GeoJSON file with countries.
countries = readOGR(dsn = "https://gist.githubusercontent.com/ccamara/fc26d8bb7e777488b446fbaad1e6ea63/raw/a6f69b6c3b4a75b02858e966b9d36c85982cbd32/countries.geojson")

# Load dataframe.
df = read.csv("https://gist.githubusercontent.com/ccamara/fc26d8bb7e777488b446fbaad1e6ea63/raw/a6f69b6c3b4a75b02858e966b9d36c85982cbd32/sample-dataframe.csv")

# Join geometry with dataframe.
countries$iso_a2 = as.factor(countries$iso_a2)
countries@data = left_join(countries@data, df, by = c('iso_a2' = 'country_code'))

# Convert to dataframe so it can be used by ggmap.
countries.t = tidy(countries)

# Here's where the problem starts, as by doing so, data has been lost!

# Recover attributes' table that was destroyed after using broom::tidy.
countries@data$id = rownames(countries@data) # Adding a new id variable.
countries.t = left_join(countries.t, countries@data, by = "id")

ggplot(data = countries.t,
       aes(long, lat, fill = country_name, group = group)) +
  geom_polygon() +
  geom_path(colour="black", lwd=0.05) + # polygon borders
  coord_equal() +
  ggtitle("Data and geometry have been messed!") +
  theme(axis.text = element_blank(), # change the theme options
        axis.title = element_blank(), # remove axis titles
        axis.ticks = element_blank()) # remove axis ticks

Z.L*_*Lin 1

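There is a reason for the messed-up behavior.

countries starts out as a large SpatialPolygonsDataFrame with 177 elements (and, correspondingly, 177 rows in countries@data). When you left_join countries@data with df, the number of elements in countries is unaffected, but the number of rows in countries@data grows to 210.

Fortifying countries with broom::tidy converts its 177 elements into a data frame whose id runs from 0 to 176. (I'm not sure why it is zero-indexed, but I generally prefer to specify the regions explicitly anyway.)

Adding id to countries@data based on rownames(countries@data), on the other hand, produces id values from 1 to 210, because that is the number of rows in countries@data after the earlier join with df. As a result, everything gets out of sync.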
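To see the mismatch in isolation, here is a minimal sketch on toy data rather than the real countries object. The tables attrs and extra and all their values are hypothetical stand-ins for countries@data and df. The sketch assumes df repeats some country codes, which is the only way a left join can return more rows than its left-hand table.

```r
library(dplyr)

# Hypothetical stand-in for countries@data: one row per polygon.
attrs <- data.frame(iso_a2 = c("AA", "BB", "CC"),
                    name = c("Aland", "Bland", "Cland"),
                    stringsAsFactors = FALSE)

# Hypothetical stand-in for df: the code "BB" appears twice.
extra <- data.frame(country_code = c("AA", "BB", "BB", "CC"),
                    value = c(10, 20, 21, 30),
                    stringsAsFactors = FALSE)

joined <- left_join(attrs, extra, by = c("iso_a2" = "country_code"))

# The joined table has more rows than there are polygons.
nrow(attrs)   # 3
nrow(joined)  # 4

# ids built from row names after the join: "1", "2", "3", "4".
ids_from_rownames <- rownames(joined)

# Zero-based ids, one per polygon, mimicking the 0..176 ids described above.
ids_from_tidy <- as.character(seq_len(nrow(attrs)) - 1)

# The same id value now refers to two different countries.
joined$name[ids_from_rownames == "2"]  # "Bland"
attrs$name[ids_from_tidy == "2"]       # "Cland"
```

Joining the fortified coordinates to the attribute table on such an id therefore attaches the wrong attributes to most polygons.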

Try the following instead:

# (we start out right after loading countries & df)

# no need to join geometry with df first

# convert countries to data frame, specifying the regions explicitly
# (note I'm using the name column rather than the iso_a2 column from countries@data;
# this is because there are some repeat -99 values in iso_a2, and we want
# one-to-one matching.)
countries.t = tidy(countries, region = "name")

# join with the original file's data
countries.t = left_join(countries.t, countries@data, by = c("id" = "name"))

# join with df
countries.t = left_join(countries.t, df, by = c("iso_a2" = "country_code"))

# no change to the plot's code, except for ggtitle
ggplot(data = countries.t,
       aes(long, lat, fill = country_name, group = group)) +
  geom_polygon() +
  geom_path(colour="black", lwd = 0.05) +
  coord_equal() +
  ggtitle("Data and geometry are fine") +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank())
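The choice of name over iso_a2 as the region rests on name being unique. A quick way to check a candidate key before using it could look like the sketch below; the toy table attrs and its values are hypothetical and only mimic the repeated -99 codes mentioned in the comments above.

```r
# Hypothetical stand-in for countries@data; "-99" repeats in iso_a2.
attrs <- data.frame(name = c("Aland", "Bland", "Cland"),
                    iso_a2 = c("AA", "-99", "-99"),
                    stringsAsFactors = FALSE)

# anyDuplicated() returns 0 when every value is unique,
# otherwise the index of the first duplicated value.
name_is_unique <- anyDuplicated(attrs$name) == 0
iso_is_unique  <- anyDuplicated(attrs$iso_a2) == 0

name_is_unique  # TRUE: usable as a one-to-one key
iso_is_unique   # FALSE: not usable as a one-to-one key

# List the values that occur more than once.
repeated_codes <- unique(attrs$iso_a2[duplicated(attrs$iso_a2)])
repeated_codes  # "-99"
```

On the real data the same check would be run against countries@data$name and countries@data$iso_a2.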


P.S. You don't actually need the ggmap package here, only the ggplot2 package that it loads.