Mar*_*ler 6 r polygon heatmap kernel-density density-plot
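I have point data off the coast of Oahu. Someone else used the same data to create a large polygon. I believe he first created a heatmap using a quartic (biweight) kernel with a radius of 1 km around each point and perhaps a pixel size of 1 sq. km. He cited Silverman (1986, p. 76, equation 4.5, which I believe refers to the book "Density Estimation for Statistics and Data Analysis"). I believe he then converted his heatmap into his polygon. I am trying to approximate his polygon with fake data, using R on Windows 10. I can come close using the kde function in the ks package (see figure below), but that package only includes Gaussian kernels. Is it possible to create a similar polygon using a quartic kernel?
The other analyst actually created two versions of the polygon. The border of one was labeled "> 1 per km density"; the border of the other was labeled "> 0.5 per km density". I do not know whether he used R, QGIS, ArcGIS or something else. I was unable to create a single large polygon in QGIS, and I do not have ArcGIS.
Thank you for any suggestions on how to create a polygon similar to the one shown, but using a quartic kernel instead of a Gaussian kernel. If I can provide additional information, please let me know.
Here is a link to my fake data in CSV and QGIS format: enter link description here (EDIT: Hopefully anyone can access the fake data now. I could before, but I guess others could not.)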
1. fake_points_oahu.csv
a. raw data
2. fake_points_oahu_utm (.shp, .dbf, .prj, .shx)
a. vector point layer
3. fake_points_oahu_June11_2021.png
a. the figure shown above
Here is my R code:
setwd('C:/Users/mark_/Documents/ctmm/density_in_R/density_files_for_StackOverflow/')
library(sf) # to read shapefile
library(ks) # to use kde function
my.data <- read.csv("fake_points_oahu.csv", header = TRUE, stringsAsFactors = FALSE, na.strings = "NA")
head(my.data)
# Import shapefile
st_layers("fake_points_oahu_utm.shp")
points_utm <- st_read(dsn = "fake_points_oahu_utm.shp", layer = 'fake_points_oahu_utm')
st_crs(points_utm)
plot(points_utm)
my.matrix <- as.matrix(my.data[,2:3])
head(my.matrix)
# This uses the Gaussian kernel
my_gps_hpi <- Hpi(x = my.matrix, pilot = "samse", pre = "scale")
my.fhat <- kde(x = my.matrix, compute.cont = TRUE, h = my_gps_hpi,
xmin = c(min(my.data$longitude), min(my.data$latitude)),
xmax = c(max(my.data$longitude), max(my.data$latitude)),
bgridsize = c(500, 500))
my.contours <- c(96.5)
contourLevels(my.fhat, cont = my.contours)
contourSizes(my.fhat, cont = my.contours, approx = TRUE)
plot(my.data$longitude, my.data$latitude)
plot(my.fhat, lwd = 3, display = "filled.contour", cont = my.contours, add = TRUE)
png(file="fake_points_oahu_June11_2021.png")
plot(my.data$longitude, my.data$latitude)
plot(my.fhat, lwd = 3, display = "filled.contour", cont = my.contours, add = TRUE)
dev.off()
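You can do the estimation by slightly modifying the kde2d function from the MASS package. As far as I know, there is currently no R package that implements bivariate KDE with a quartic (biweight) kernel.
A univariate biweight kernel can be extended to a multivariate kernel in several ways. The simplest is to use a product kernel, where you apply the univariate kernel along each dimension and then multiply the results. You can find the mathematical expression for the biweight product kernel here. When you incorporate this kernel into the density estimator from kde2d in the MASS package, it looks as follows: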
kde_biweight_kernel <- function(x,y, bw_x, bw_y, xrange, yrange){
# This function is based on the kde2d function from
# the MASS package. The only difference is that the Gaussian
# kernel is substituted with a biweight product kernel.
# Univariate biweight kernel, applied along each axis:
biweight_kernel <- function(u){
mask = abs(u) > 1
kernel_val = (15/16)*((1-u^2)^2)
kernel_val[mask] = 0
return(kernel_val)
}
lims = c(xrange, yrange)
n = 500
nx <- length(x)
n <- rep(n, length.out = 2L)
# get grid on which we want to estimate the density
gx <- seq.int(lims[1L], lims[2L], length.out = n[1L])
gy <- seq.int(lims[3L], lims[4L], length.out = n[2L])
# inputs to kernel
ax <- outer(gx, x, "-" )/bw_x
ay <- outer(gy, y, "-" )/bw_y
# evaluate and multiply kernel results along both axes
res = tcrossprod(biweight_kernel(ax), biweight_kernel(ay))/(nx * bw_x * bw_y)
return(list(x = gx, y = gy, z = res))
}
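The product kernel above has a square support around each point. Since the question mentions a fixed radius around each point, here is a minimal sketch of the other common extension, the radially symmetric biweight kernel K(u) = (3/pi) * (1 - |u|^2)^2 for |u| < 1. As far as I can tell this is the form that Silverman's equation 4.5 describes, but treat that as an assumption. The function name kde_biweight_radial, the single bandwidth h (acting as the radius) and the simulated points are all hypothetical and only there for illustration.

```r
# Radially symmetric quartic (biweight) kernel in two dimensions:
# K(u) = (3 / pi) * (1 - |u|^2)^2 for |u| < 1, and 0 otherwise.
# kde_biweight_radial is a hypothetical helper, not part of any package.
kde_biweight_radial <- function(x, y, h, xrange, yrange, n = 200) {
  gx <- seq(xrange[1], xrange[2], length.out = n)
  gy <- seq(yrange[1], yrange[2], length.out = n)
  # squared scaled offsets: dx2[i, k] = ((gx[i] - x[k]) / h)^2
  dx2 <- (outer(gx, x, "-") / h)^2
  dy2 <- (outer(gy, y, "-") / h)^2
  z <- matrix(0, n, n)
  for (k in seq_along(x)) {
    # n x n matrix of squared scaled distances from grid cells to point k
    d2 <- outer(dx2[, k], dy2[, k], "+")
    z <- z + ifelse(d2 < 1, (3 / pi) * (1 - d2)^2, 0)
  }
  # normalise by number of points and h^2 (two dimensions)
  list(x = gx, y = gy, z = z / (length(x) * h^2))
}

# Simulated points (assumption: stand-in for projected coordinates)
set.seed(1)
x <- rnorm(50)
y <- rnorm(50)
fit <- kde_biweight_radial(x, y, h = 1, xrange = c(-6, 6), yrange = c(-6, 6))

# Riemann-sum check that the estimate integrates to roughly 1
cell <- diff(fit$x[1:2]) * diff(fit$y[1:2])
total_mass <- sum(fit$z) * cell
```

The returned list has the same x, y, z layout as the output of kde_biweight_kernel, so it could be passed through the same later steps. For h to mean a distance such as 1 km, the coordinates would have to be projected (for example the UTM layer) rather than longitude and latitude.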
Using the kde_biweight_kernel function, you can compute the desired density as follows:
library(MASS)
library(birk)
library(kedd)
library(sf)
library(ks)
# load data
my.data <- read.csv("fake_points_oahu.csv", header = TRUE, stringsAsFactors = FALSE, na.strings = "NA")
# Import shapefile
st_layers("fake_points_oahu_utm.shp")
points_utm <- st_read(dsn = "fake_points_oahu_utm.shp", layer = 'fake_points_oahu_utm')
x = my.data$longitude
y = my.data$latitude
# determine bandwidth for biweight kernel along both axes
bw_x = h.amise(x, deriv.order = 0, kernel = "biweight")$h
bw_y = h.amise(y, deriv.order = 0, kernel = "biweight")$h
# get ranges in which you want to estimate density
xrange = c(min(my.data$longitude), max(my.data$longitude))
yrange = c(min(my.data$latitude), max(my.data$latitude))
# get 2d density estimate with quartic (biweight) kernel
result = kde_biweight_kernel(x,y, bw_x, bw_y, xrange, yrange)
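Note that the bandwidths are computed specifically for the biweight kernel. The resulting density object differs a bit from an actual ks::kde object. For example, it does not have contour levels yet. We can obtain contour levels by computing quantiles with a slightly modified version of the kde2dQuantile function from the rmngb package: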
# get quantiles of interest:
kde2dQuantile <- function(d, X, Y, probs = .05) {
xInd <- sapply(X, function(x) which.closest(d$x, x))
yInd <- sapply(Y, function(x) which.closest(d$y, x))
zValues <- d$z[cbind(xInd, yInd)]
quantile(zValues, probs=probs)
}
# get quantiles
quantiles = kde2dQuantile(result, x, y, seq(0,1,by=0.001))
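To make the link between quantiles and contour levels concrete, here is a small self-contained sketch in base R. It assumes a toy density grid (a bivariate standard normal standing in for `result`) and a hypothetical helper nearest_index that plays the role of which.closest. The idea it illustrates is that the 96.5% contour corresponds to the 3.5% quantile of the density values at the data points.

```r
set.seed(42)
x <- rnorm(400)
y <- rnorm(400)

# Toy density on a grid (assumption: stands in for `result`)
gx <- seq(-4, 4, length.out = 161)
gy <- seq(-4, 4, length.out = 161)
d <- list(x = gx, y = gy, z = outer(dnorm(gx), dnorm(gy)))

# Hypothetical base-R stand-in for which.closest():
# index of the grid value nearest to each element of v
nearest_index <- function(grid, v) {
  vapply(v, function(p) which.min(abs(grid - p)), integer(1))
}

# Density value at the grid cell nearest to each data point
z_at_points <- d$z[cbind(nearest_index(d$x, x), nearest_index(d$y, y))]

# Height of the 96.5% contour = 3.5% quantile of those density values
level <- quantile(z_at_points, probs = 1 - 0.965)

# Share of points lying on or inside that contour
inside <- mean(z_at_points >= level)
```

Here `inside` should come out close to 0.965, which is what makes the quantile usable as a contour height.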
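From your question, I am not sure which quantiles you are interested in, so I simply computed all quantiles from 0 to 1 in steps of 0.001 (0.1%). To be able to plot the data in the same way as in the question, we have to format the density result like an object of class kde: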
# to make the kde estimate compatible with the other density estimates
# from the ks package, the result can be converted to a named list.
# -> create ks::KDE object:
axes = matrix(c(result$x,result$y), ncol = 2)
colnames(axes) = c('longitude', 'latitude')
my.fhat_biweight = list('x' = axes,
'eval.points' = list(result$x, result$y),
'estimate' = result$z,
'gridtype' = 'linear', 'gridded' = TRUE,
'binned' = TRUE, 'names' = c("longitude","latitude" ))
# add quantile to ks::KDE object
my.fhat_biweight$cont = quantiles
# change class (make sure ks package is loaded for this)
class(my.fhat_biweight) <- "kde"
Finally, plot the biweight kernel density on top of the data:
plot(my.data$longitude, my.data$latitude)
plot(my.fhat_biweight, lwd = 3, display = "filled.contour", cont = c(96.5), add = TRUE)
This outputs a plot of the points with the filled biweight-kernel contour drawn on top.
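If you need the boundary as polygon coordinates rather than as a plot, one possible route is contourLines from base R (grDevices), applied to the x, y, z grid at a chosen density threshold. The sketch below is only an illustration under assumptions: it uses a toy density grid in place of `result`, and the threshold value is hypothetical. It also does not try to distinguish outer rings from holes.

```r
# Toy density grid (assumption: stands in for `result`, a list with x, y, z)
gx <- seq(-3, 3, length.out = 121)
gy <- seq(-3, 3, length.out = 121)
dens <- list(x = gx, y = gy, z = outer(dnorm(gx), dnorm(gy)))

# Hypothetical density cut-off at which to draw the boundary
threshold <- 0.05

# Each list element holds the level and the x/y vertices of one contour line
rings <- contourLines(dens$x, dens$y, dens$z, levels = threshold)

# Turn each contour line into a closed two-column coordinate matrix
polys <- lapply(rings, function(r) {
  xy <- cbind(x = r$x, y = r$y)
  # repeat the first vertex at the end if the ring is not already closed
  if (!all(xy[1, ] == xy[nrow(xy), ])) xy <- rbind(xy, xy[1, ])
  xy
})
```

Each closed matrix in `polys` can then be handed to sf::st_polygon(list(xy)) if an sf geometry is wanted. Note that a label such as "> 1 per km density" might refer to a count density rather than a probability density. If so, the estimate would need to be multiplied by the number of points and computed in projected units before thresholding, but that is a guess about the other analyst's method.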