Suppose I have some count data that looks like this:
library(tidyr)
library(dplyr)
X.raw <- data.frame(
x = as.factor(c("A", "A", "A", "B", "B", "B")),
y = as.factor(c("i", "ii", "ii", "i", "i", "i")),
z = 1:6)
X.raw
# x y z
# 1 A i 1
# 2 A ii 2
# 3 A ii 3
# 4 B i 4
# 5 B i 5
# 6 B i 6
I want to tidy and summarize it like this:
X.tidy <- X.raw %>% group_by(x,y) %>% summarise(count=sum(z))
X.tidy
# Source: local data frame [3 x 3]
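# Groups: x …

I am trying to summarize a dataset by group, using a dummy column to indicate whether each group's value appears among the most frequent values in the ungrouped data.
As an example, let's take the flights data from nycflights13.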
library(dplyr, warn.conflicts = FALSE)
library(nycflights13)
my_flights_raw <-
flights %>%
select(carrier, month, dest)
my_flights_raw
#> # A tibble: 336,776 x 3
#> carrier month dest
#> <chr> <int> <chr>
#> 1 UA 1 IAH
#> 2 UA 1 IAH
#> 3 AA 1 MIA
#> 4 B6 1 BQN
#> 5 DL 1 ATL
#> 6 UA 1 ORD
#> 7 B6 1 FLL
#> 8 EV 1 IAD
#> 9 B6 …

Questions:
What is the data.table equivalent of tidyr's complete with group by?
What is the relationship between on and by in data.table?
Example:
library(data.table)
dt <- data.table(a = c(1, 1, 2, 2, 3, 3, 4, 4), b = c(4, 5, 6, 7, 8, 9, 10, 11), c = c("x", "x", "x", "x", "y", "y", "y", "y"))
show(dt)
a b c
1: 1 4 x
2: 1 5 x
3: 2 6 x
4: 2 7 x
5: 3 8 y
6: 3 9 y
7: 4 10 y
8: 4 11 y
The goal is to obtain the following:
a b c
1 4 x
1 5 x
1 6 x
1 7 x
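2 …

I have a large grid made up of X and Y coordinates, where each coordinate pair represents a value. However, some combinations within the grid do not exist; see the attached image:
I want to identify the missing xy combinations with an R script, but I don't know how to do it. What is an efficient way to get these combinations?
An example of my data: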
df1 <- structure(list(coord_n = c(1065125L, 1065875L, 1064625L, 1064375L,
1065625L, 1065375L, 1065625L, 1065125L, 1065625L, 1065125L, 1066125L,
1064625L, 1066375L, 1064125L, 1064375L, 1064625L, 1066375L, 1064875L,
1066125L, 1066625L, 1064375L, 1065125L, 1066375L, 1066625L, 1065125L,
1065875L, 1064125L, 1064375L, 1064125L, 1065875L, 1064625L, 1065125L,
1065125L, 1065625L, 1066375L, 1064375L, 1064875L, 1065875L, 1066375L,
1066625L, 1064375L, 1064625L, 1066375L, 1065875L, 1065375L, 1065375L,
1066625L, 1065375L, 1064625L, 1066625L, 1066125L, 1065625L, 1065375L,
1065875L, 1064125L, 1064375L, 1064875L, 1065625L, 1065625L, 1064625L,
1064875L, 1065375L, 1065875L, 1065875L, 1066625L, 1065875L, 1064875L,
1066625L, 1064875L, 1064125L, 1066125L, 1064375L, 1066375L, …Run Code Online (Sandbox Code Playgroud)