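I have two data frames (df and df1). df1 is a subset of df. I want to get a data frame that is the complement of df1 in df, i.e. the rows of the first data set that have no match in the second. For example,
Data frame df: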
heads
row1
row2
row3
row4
row5
Data frame df1:
heads
row3
row5
Then the desired output df2 is:
heads
row1
row2
row4
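One possible way to get this complement, sketched here in base R under the assumption that rows are matched on the single column heads (the data frames are rebuilt from the example above):

```r
# Example data taken from the question
df  <- data.frame(heads = c("row1", "row2", "row3", "row4", "row5"))
df1 <- data.frame(heads = c("row3", "row5"))

# Keep only the rows of df whose `heads` value does not occur in df1.
# drop = FALSE keeps the result a data frame even with a single column.
df2 <- df[!(df$heads %in% df1$heads), , drop = FALSE]
print(df2)
```

If dplyr is available, anti_join(df, df1, by = "heads") expresses the same idea and extends to matching on several columns.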
tidyr::complete() adds rows to a data.frame for combinations of column values that are missing from the data. Example:
library(dplyr)
library(tidyr)
df <- data.frame(person = c(1, 2, 2),
                 observation_id = c(1, 1, 2),
                 value = c(1, 1, 1))
df %>%
  tidyr::complete(person,
                  observation_id,
                  fill = list(value = 0))
which yields
# A tibble: 4 × 3
  person observation_id value
   <dbl>          <dbl> <dbl>
1      1              1     1
2      1              2     0
3      2              1     1
4      2              2     1
Here the combination person == 1 and observation_id == 2, which is missing from df, has been filled in with a value of 0.
What is the equivalent of this in data.table?
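A minimal sketch of one data.table approach, assuming the same example data: build every combination of the key columns with CJ(), join the table onto that grid, then replace the resulting NA values. The names dt and res are only illustrative.

```r
library(data.table)

# Same example data as above, as a data.table
dt <- data.table(person = c(1, 2, 2),
                 observation_id = c(1, 1, 2),
                 value = c(1, 1, 1))

# CJ() builds the cross join of the unique key values; joining dt onto
# that grid adds a row with NA in `value` for each missing combination.
res <- dt[CJ(person = person, observation_id = observation_id, unique = TRUE),
          on = .(person, observation_id)]

# Fill the gaps by reference, mirroring fill = list(value = 0)
res[is.na(value), value := 0]
print(res)
```

Note that this fills every NA in value, including any that were already present in the original data, so it matches complete() only when value has no pre-existing missing values.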
I have the following data.table, which I cannot recreate from the output of the dput command:
> ddt
Unit Anything index new
1: A 3.4 1 1
2: A 6.9 2 1
3: A1 1.1 1 2
4: A1 2.2 2 2
5: B 2.0 1 3
6: B 3.0 2 3
>
>
> str(ddt)
Classes ‘data.table’ and 'data.frame': 6 obs. of 4 variables:
$ Unit : Factor w/ 3 levels "A","A1","B": 1 1 2 2 3 3
$ Anything: num 3.4 6.9 1.1 2.2 2 3
$ index : num 1 2 1 …