I am trying to drop all rows of a group in a data.table if the maximum value within that group exceeds some threshold. Below is how I would do it in dplyr, and how I got it working in two steps in data.table.
# dplyr
library(data.table)
library(dplyr)
df <- data.table(
  x = 1:12,
  y = 1:3
)
df %>%
  group_by(y) %>%
  filter(max(x) < 11)
# data.table
df[, max_value := max(x), by = y][max_value < 11]
The output should be:
    x y
1:  1 1
2:  4 1
3:  7 1
4: 10 1
Is there a way to do this in one step, without creating the extra column in my dataset? All I have been able to find are examples that subset a group to get one specific value within it, not ones that return all rows of every group that meets the condition.
We can use .I to get the row indices of the groups that meet the condition, extract the index column (V1), and use it to subset the original table:
df[df[, .I[max(x) < 11], y]$V1]
#     x y
# 1:  1 1
# 2:  4 1
# 3:  7 1
# 4: 10 1
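To see why this works, it may help to look at the intermediate result. The following is a minimal sketch, assuming the same df as in the question; the names idx and res are only illustrative. Within each group, max(x) < 11 is a single TRUE or FALSE, so .I[...] yields either all of that group's row numbers or none of them.

```r
library(data.table)

# Same example data as in the question (y is recycled to length 12)
df <- data.table(x = 1:12, y = 1:3)

# Step 1: per group, keep the row numbers only when the group's max is below 11.
# The unnamed result column is called V1 by default.
idx <- df[, .I[max(x) < 11], by = y]
idx

# Step 2: use those row numbers to subset the original table
res <- df[idx$V1]
res
```

Because the subset is done on the original table by row number, the column order of df is preserved and no helper column is added.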
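Another option is to subset .SD, the per-group subset of the data, directly. Note that the grouping column y comes first in this result: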
df[, .SD[max(x) < 11], y]
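A closely related variant, not part of the answer above but a common data.table idiom, wraps .SD in an if. This is a sketch under the same assumptions as before (the df from the question; res_if is an illustrative name): an if without an else returns NULL when the condition fails, so groups that do not qualify contribute no rows.

```r
library(data.table)

# Same example data as in the question
df <- data.table(x = 1:12, y = 1:3)

# Return the group's rows only when the condition holds;
# failing groups evaluate to NULL and are dropped from the result.
res_if <- df[, if (max(x) < 11) .SD, by = y]
res_if
```

As with the plain .SD version, the grouping column appears first, so the result has columns y and x rather than x and y.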