"Transposing" a pandas Series

Question

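I have a DataFrame with an ID column and several feature columns. For each feature column, I want summary statistics of how many unique IDs occur per value of that column.

The following code works, but I'd like to know whether there is a better way than to_frame().unstack().unstack() to transpose the Series returned by .describe() into a single DataFrame row whose columns are the percentiles, max, min, and so on.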

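import pandas as pd

id_col = 'ID'  # assumed name of the ID column; the question does not show its definition

def unique_ids(df):
    rows = []
    # One row of describe() statistics per feature column
    for col in sorted(c for c in df.columns if c != id_col):
        v = df.groupby(col)[id_col].nunique().describe()
        v = v.to_frame().unstack().unstack()  # Transpose
        v.index = [col]
        rows.append(v)

    return pd.concat(rows)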


Answer 1

jez*_*ael 5

It looks like you need to change:

v = v.to_frame().unstack().unstack()


to

v = v.to_frame().T
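
As a minimal sketch of what that replacement does, the snippet below turns a describe() result into a one-row DataFrame. The sample Series and its name n_ids are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data; any numeric Series works with describe().
s = pd.Series([1, 2, 2, 3], name="n_ids")

stats = s.describe()        # Series indexed by count, mean, std, min, ...
row = stats.to_frame().T    # one-row DataFrame: statistics become columns

print(row)
```

The row label is taken from the Series name, so renaming the Series before transposing is enough to label the row.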


Or alternatively, transpose the final DataFrame instead, and use rename to label each Series with col:

import pandas as pd

df = pd.DataFrame({'ID':[1,1,3],
                   'E':[4,5,5],
                   'C':[7,8,9]})

print (df)
   C  E  ID
0  7  4   1
1  8  5   1
2  9  5   3

def unique_ids(df):
    rows = []
    id_col = 'ID'
    for col in sorted(c for c in df.columns if c != id_col):
        v = df.groupby(col)[id_col].nunique().describe().rename(col)
        rows.append(v)
    return pd.concat(rows, axis=1).T

print (unique_ids(df))
   count  mean       std  min   25%  50%   75%  max
C    3.0   1.0  0.000000  1.0  1.00  1.0  1.00  1.0
E    2.0   1.5  0.707107  1.0  1.25  1.5  1.75  2.0
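
A possible variation on the same idea, offered only as a sketch: pd.concat also accepts a dict of Series, in which case the keys become the column labels and the explicit rename is not needed. The id_col default argument is an assumption added here for convenience.

```python
import pandas as pd

def unique_ids(df, id_col="ID"):
    # Map each feature column to the describe() statistics of its
    # per-value unique-ID counts.
    stats = {
        col: df.groupby(col)[id_col].nunique().describe()
        for col in sorted(c for c in df.columns if c != id_col)
    }
    # Dict keys become column labels; transpose so each feature is a row.
    return pd.concat(stats, axis=1).T

df = pd.DataFrame({"ID": [1, 1, 3], "E": [4, 5, 5], "C": [7, 8, 9]})
result = unique_ids(df)
print(result)
```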

