I have a table named myTable (input):
user_name session_num
1 "Joe" 1
2 "Tom" 2
3 "Fred" 1
4 "Tom" 1
5 "Joe" 2
6 "John" 1
I want to find the users (user_name) whose only session_num is 1 (output):
user_name session_num
1 "Fred" 1
2 "John" 1
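For anyone who wants to reproduce the example, here is a minimal sketch that rebuilds the input table. The column types are an assumption (the post does not show how the table was created), and `df` is created only as a second name for the same data, since some answers use that name.

```r
# Rebuild the example table from the question.
# Assumption: user_name is character and session_num is integer.
myTable <- data.frame(
  user_name   = c("Joe", "Tom", "Fred", "Tom", "Joe", "John"),
  session_num = c(1L, 2L, 1L, 1L, 2L, 1L),
  stringsAsFactors = FALSE
)

# Some answers call the same table `df`.
df <- myTable
```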
Here is one possible solution using data.table (df here is the question's myTable):
library(data.table)
setDT(df)[, if(all(session_num == 1)) .SD, by = user_name]
# user_name session_num
# 1: Fred 1
# 2: John 1
Another option is an anti-join (df must already be a data.table, for example after calling setDT(df)):
df[session_num == 1][!df[session_num != 1], on = "user_name"]
# user_name session_num
# 1: Fred 1
# 2: John 1
A similar solution with dplyr:
library(dplyr)
myTable %>%
group_by(user_name) %>%
filter(all(session_num == 1))
This gives:
user_name session_num
(fctr) (int)
1 Fred 1
2 John 1
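As a cross-check that needs no packages, the same "every row in the group has session_num == 1" test can be sketched in base R with `ave()`. This is not from the original answers. The table is rebuilt here so the snippet runs on its own, and the names `keep`, `result` and `n_users` are illustrative only.

```r
# Assumption: the same data as in the question, rebuilt so this runs standalone.
myTable <- data.frame(
  user_name   = c("Joe", "Tom", "Fred", "Tom", "Joe", "John"),
  session_num = c(1L, 2L, 1L, 1L, 2L, 1L),
  stringsAsFactors = FALSE
)

# For each row, TRUE if all rows of that user have session_num == 1.
keep <- ave(myTable$session_num == 1, myTable$user_name, FUN = all)

# Rows for users who only ever have session_num == 1.
result <- myTable[keep, ]

# The question also asks "how many": count the distinct users kept.
n_users <- length(unique(result$user_name))
```

Here `result` holds the rows for Fred and John, and `n_users` is 2.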