Remove groups by condition

Question

Suppose I have the following data frame:

using DataFrames
df = DataFrame(A = 1:10, B = ["a","a","b","b","b","c","c","c","c","d"])
grouped_df = groupby(df, "B")

I would have four groups. How can I drop the groups that have fewer than, say, 2 rows? For example, how can I keep only groups a, b, and c? I can easily do it with a for loop, but I don't think that is the optimal way.

Answer 1

Bog*_*ski 4

If you want the result to remain grouped, then filter is the simplest option:

julia> filter(x -> nrow(x) > 1, grouped_df)
GroupedDataFrame with 3 groups based on key: B
First Group (2 rows): B = "a"
 Row │ A      B
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     2  a
⋮
Last Group (4 rows): B = "c"
 Row │ A      B
     │ Int64  String
─────┼───────────────
   1 │     6  c
   2 │     7  c
   3 │     8  c
   4 │     9  c

If you want to get a data frame in a single operation, do the following:

julia> combine(grouped_df, x -> nrow(x) < 2 ? DataFrame() : x)
9×2 DataFrame
 Row │ B       A
     │ String  Int64
─────┼───────────────
   1 │ a           1
   2 │ a           2
   3 │ b           3
   4 │ b           4
   5 │ b           5
   6 │ c           6
   7 │ c           7
   8 │ c           8
   9 │ c           9
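
If the minimum group size needs to vary, the filter call can be wrapped in a small function. This is only a sketch: keep_large_groups and its minrows argument are hypothetical names, not part of DataFrames.jl. It also shows one way to turn the filtered groups back into an ordinary data frame, by calling DataFrame on the GroupedDataFrame.

```julia
using DataFrames

df = DataFrame(A = 1:10, B = ["a","a","b","b","b","c","c","c","c","d"])
grouped_df = groupby(df, "B")

# Hypothetical helper (not a DataFrames.jl function):
# keep only the groups that have at least `minrows` rows.
keep_large_groups(gdf, minrows) = filter(sdf -> nrow(sdf) >= minrows, gdf)

kept = keep_large_groups(grouped_df, 2)  # still a GroupedDataFrame
flat = DataFrame(kept)                   # plain DataFrame with the rows of the kept groups
```

With the example data, kept should hold the groups a, b, and c, and flat should hold their 9 rows. Raising minrows to 3 would additionally drop group a.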