Edw*_*rek 6 r dataframe dplyr interleave
How can I interleave the rows of 2 data frames as a perfect riffle shuffle?
Sample data:
df1 <- data.frame(df = 1, id = 1:5, chr = 'puppies')
df2 <- data.frame(df = 2, id = 1:5, chr = 'kitties')
DF1:
df id chr
1 1 1 puppies
2 1 2 puppies
3 1 3 puppies
4 1 4 puppies
5 1 5 puppies
DF2:
df id chr
1 2 1 kitties
2 2 2 kitties
3 2 3 kitties
4 2 4 kitties
5 2 5 kitties
Desired result:
df id chr
1 1 1 puppies
2 2 1 kitties
3 1 2 puppies
4 2 2 kitties
5 1 3 puppies
6 2 3 kitties
7 1 4 puppies
8 2 4 kitties
9 1 5 puppies
10 2 5 kitties
A non-dplyr solution is to use the interleave function from the gdata package.
gdata::interleave(df1, df2)
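Assign row numbers to each data frame independently, then bind the rows and sort (arrange) by row number and data frame ID. In this example the row numbers are trivial, because id is sequential and already acts as a row number. In general, though, explicit row numbers should be used.
Here is an example using dplyr: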
library(dplyr)

df1 %>%
  mutate(row_number = row_number()) %>%                     # position within df1
  bind_rows(df2 %>% mutate(row_number = row_number())) %>%  # position within df2
  arrange(row_number, df)                                   # df breaks ties: df1 row first
Output:
df id chr row_number
(dbl) (int) (chr) (int)
1 1 1 puppies 1
2 2 1 kitties 1
3 1 2 puppies 2
4 2 2 kitties 2
5 1 3 puppies 3
6 2 3 kitties 3
7 1 4 puppies 4
8 2 4 kitties 4
9 1 5 puppies 5
10 2 5 kitties 5
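The same idea (number the rows of each data frame separately, bind, then sort by row number and source) can also be sketched in base R, which may be handy when neither dplyr nor gdata is available. This is only a minimal sketch: interleave_rows is a hypothetical helper name, not a function from any package, and it assumes both data frames have the same columns.

```r
# Hypothetical helper (not from any package): interleave the rows of two
# data frames that share the same columns.
interleave_rows <- function(a, b) {
  combined <- rbind(a, b)
  # Position of each row within its own data frame
  pos <- c(seq_len(nrow(a)), seq_len(nrow(b)))
  # Source indicator breaks ties, so a row of `a` precedes the matching row of `b`
  src <- rep(c(1L, 2L), times = c(nrow(a), nrow(b)))
  out <- combined[order(pos, src), , drop = FALSE]
  rownames(out) <- NULL  # renumber rows 1..n
  out
}

df1 <- data.frame(df = 1, id = 1:5, chr = 'puppies')
df2 <- data.frame(df = 2, id = 1:5, chr = 'kitties')
result <- interleave_rows(df1, df2)
```

Because the ordering relies on positions computed from nrow() rather than on the id column, the sketch should not depend on id being sequential.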