小编gkb*_*aby的帖子

Polars 相当于 pandas 表达式 df.groupby['col1','col2']['col3'].sum().unstack()

pandasdf=pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "fruits": ["banana", "banana", "apple", "apple", "banana"],
        "B": [5, 4, 3, 2, 1],
        "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
        "optional": [28, 300, None, 2, -30],
    }
)
pandasdf.groupby(["fruits","cars"])['B'].sum().unstack()
Run Code Online (Sandbox Code Playgroud)

在此输入图像描述

如何在极坐标中创建等效的真值表?

类似于下表的真值表

df=pl.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "fruits": ["banana", "banana", "apple", "apple", "banana"],
        "B": [5, 4, 3, 2, 1],
        "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
        "optional": [28, 300, None, 2, -30],
    }
)
df.groupby(["fruits","cars"]).agg(pl.col('B').sum()) #->truthtable
Run Code Online (Sandbox Code Playgroud)

代码的效率很重要,因为数据集太大(与 apriori 算法一起使用)

Polars 中的 unstack 函数是不同的,pd.crosstab …

python analytics data-analysis dataframe python-polars

1
推荐指数
1
解决办法
962
查看次数