Bar*_*art 5 r dplyr nse tidyeval
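For some objects, an attribute identifies a special column, for example the geometry column in sf objects. To do calculations on it within dplyr, it would be good to be able to identify this column easily. I am looking for a way to create a function that helps identify this column. In the example below I can create a function that identifies the column, but I still need to use the rlang splice operator (!!!).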
require(sf)
require(dplyr)
n <- 4
df = st_as_sf(data.frame(x = 1:n, y = 1:n, cat = gl(2, 2)), coords = 1:2, crs = 3857) %>% group_by(cat)
# this is the example I start from, however the geometry column is not guaranteed to have that name
df %>% mutate(d = st_distance(geometry, geometry[row_number() == 1]))
#> Simple feature collection with 4 features and 2 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 1 xmax: 4 ymax: 4
#> Projected CRS: WGS 84 / Pseudo-Mercator
#> # A tibble: 4 × 3
#> # Groups:   cat [2]
#>   cat      geometry d[,1]
#> * <fct> <POINT [m]>   [m]
#> 1 1           (1 1)  0
#> 2 1           (2 2)  1.41
#> 3 2           (3 3)  0
#> 4 2           (4 4)  1.41
# this works, however the code does not get easier to read
df %>% mutate(d = st_distance(!!!syms(attr(., "sf_column")), (!!!syms(attr(., "sf_column")))[row_number() == 1]))
#> Simple feature collection with 4 features and 2 fields
#> ...
#> 4 2           (4 4)  1.41
# this works and is already better:
geometry_name <- function(x) syms(attr(x, 'sf_column'))
df %>% mutate(d = st_distance(!!!geometry_name(.), (!!!geometry_name(.))[row_number() == 1]))
#> Simple feature collection with 4 features and 2 fields
#> ...
#> 4 2           (4 4)  1.41
Ideally, I would like to find a function that makes the following code work, since this is the simplest for users:
df %>% mutate(d = st_distance(geometry_name(), geometry_name()[row_number() == 1]))
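Calling a function like this without arguments requires you to assume that certain symbols exist in the calling frame (in this case the . placeholder and the .data pronoun), so it will not work properly outside dplyr verbs. If that suits your workflow, you could do: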
geometry_name <- function() {
  # .data pronoun as seen from the calling dplyr verb
  .data <- eval(quote(.data), parent.frame())
  # column names of the piped data frame (the . placeholder)
  nms <- names(eval(quote(.), parent.frame()))
  # positions of the columns that inherit from 'sfc'
  geo <- which(sapply(nms, function(x) inherits(.data[[x]], 'sfc')))
  if(length(geo) == 0) {
    stop('No geometry column detected')
  }
  if(length(geo) > 1) {
    warning('More than one geometry column. Only the first will be used.')
    geo <- geo[1]
  }
  .data[[nms[geo]]]
}
With your example, this allows you to use the syntax you specified:
df %>%
  mutate(d = st_distance(geometry_name(), geometry_name()[row_number() == 1]))
#> Simple feature collection with 4 features and 2 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 1 xmax: 4 ymax: 4
#> Projected CRS: WGS 84 / Pseudo-Mercator
#> # A tibble: 4 x 3
#> # Groups:   cat [2]
#>   cat      geometry d[,1]
#> * <fct> <POINT [m]>   [m]
#> 1 1           (1 1)  0
#> 2 1           (2 2)  1.41
#> 3 2           (3 3)  0
#> 4 2           (4 4)  1.41
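The lookup that the zero-argument geometry_name() relies on can be illustrated without dplyr or sf. The following is only a base R sketch of the same eval(quote(.), parent.frame()) idea: first_char_col() and with_dot() are hypothetical names invented for this illustration, a function argument named "." stands in for the pipe placeholder, and a character column stands in for the geometry column. It does not reproduce how dplyr builds its data mask.

```r
# Hypothetical helper: takes no arguments and instead looks for a symbol
# named `.` in the frame it was called from.
first_char_col <- function() {
  caller <- parent.frame()
  dot <- tryCatch(
    eval(quote(.), caller),
    error = function(e) {
      stop("first_char_col() needs a `.` in the calling frame", call. = FALSE)
    }
  )
  # Flag the character columns (the stand-in for 'sfc' columns here).
  is_chr <- vapply(dot, is.character, logical(1))
  if (!any(is_chr)) stop("No character column detected", call. = FALSE)
  dot[[which(is_chr)[1]]]
}

# A function whose argument is literally named `.` plays the role of the pipe.
with_dot <- function(.) first_char_col()

toy <- data.frame(id = 1:2, label = c("a", "b"), stringsAsFactors = FALSE)

# Inside with_dot() the symbol `.` exists, so the lookup succeeds.
res_inside <- with_dot(toy)

# At top level there is no `.`, so the call fails with the custom message.
res_outside <- tryCatch(first_char_col(), error = function(e) conditionMessage(e))
```

The design trade-off is the one described above: the function is convenient to call, but it only works where the caller happens to define the symbols it expects.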
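You could make geometry_name() more useful by allowing it to take a data argument. If data is missing, it runs the code above (after checking that . and .data exist); otherwise it just finds and returns the sf column from data. This allows use outside dplyr verbs while keeping the desired behaviour inside dplyr.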
For example:
geometry_name <- function(data) {
  if(missing(data)) {
    # No data supplied: look for the symbols a dplyr verb provides
    .data <- tryCatch( {
      eval(quote(.data), parent.frame())
    }, error = function(e){
      stop("Argument 'data' missing, with no default")
    })
    plchlder <- tryCatch({
      eval(quote(.), parent.frame())
    }, error = function(e) {
      stop("geometry_name can only be used without a 'data' argument ",
           "inside dplyr verbs")
    })
    nms <- names(plchlder)
    geo <- which(sapply(nms, function(x) inherits(.data[[x]], 'sfc')))
    if(length(geo) == 0) {
      stop('No geometry column detected')
    }
    if(length(geo) > 1) {
      warning('More than one geometry column. Only the first will be used.')
      geo <- geo[1]
    }
    return(.data[[nms[geo]]])
  }
  # data supplied: search its columns directly
  geo <- which(sapply(data, function(x) inherits(x, 'sfc')))
  if(length(geo) == 0) stop('No geometry column detected')
  if(length(geo) > 1) {
    warning('More than one geometry column. Only the first will be used.')
    geo <- geo[1]
  }
  return(data[[geo]])
}
This gives the following behaviour:
geometry_name(df)
#> [1] "geometry"
geometry_name()
#> Error in value[[3L]](cond) :
#> geometry_name can only be used without a 'data' argument inside
#> dplyr verbs
df %>%
  mutate(d = st_distance(geometry_name(), geometry_name()[row_number() == 1]))
#> Simple feature collection with 4 features and 2 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 1 xmax: 4 ymax: 4
#> Projected CRS: WGS 84 / Pseudo-Mercator
#> # A tibble: 4 x 3
#> # Groups:   cat [2]
#>   cat      geometry d[,1]
#> * <fct> <POINT [m]>   [m]
#> 1 1           (1 1)  0
#> 2 1           (2 2)  1.41
#> 3 2           (3 3)  0
#> 4 2           (4 4)  1.41
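The two branches of the data-aware geometry_name() repeat the same "none found" and "more than one found" checks. One possible way to avoid the duplication, offered only as a sketch, is to move them into a small helper. The name first_sfc_index() is hypothetical, and the example uses hand-made objects with class 'sfc' instead of real sf geometries so that it runs in base R alone.

```r
# Hypothetical helper: position of the first element of `cols` that
# inherits from "sfc", with the same error and warning as the answer's code.
first_sfc_index <- function(cols) {
  geo <- which(vapply(cols, inherits, logical(1), what = "sfc"))
  if (length(geo) == 0) {
    stop("No geometry column detected", call. = FALSE)
  }
  if (length(geo) > 1) {
    warning("More than one geometry column. Only the first will be used.",
            call. = FALSE)
    geo <- geo[1]
  }
  unname(geo)
}

# Stand-in for a geometry column: a plain list tagged with class "sfc".
# (Assumption for illustration only; real columns come from sf.)
fake_sfc <- structure(list(1, 2), class = c("sfc_POINT", "sfc"))

one_geom <- list(id = 1:2, geom = fake_sfc)
idx <- first_sfc_index(one_geom)

two_geoms <- list(a = fake_sfc, b = fake_sfc)
warn_msg <- tryCatch(first_sfc_index(two_geoms),
                     warning = function(w) conditionMessage(w))

no_geom <- tryCatch(first_sfc_index(list(id = 1:2)),
                    error = function(e) conditionMessage(e))
```

With such a helper, each branch of geometry_name() would only need to build the list of candidate columns and index into it with the returned position.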