use*_*059 5 python group-by dataframe pandas pandas-groupby
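After a query, the DataFrame df is empty. When I group it, a runtime warning is raised and I get back another empty DataFrame, this time with no columns. How can I keep the columns?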
df = pd.DataFrame(columns=["PlatformCategory","Platform","ResClassName","Amount"])
print df
Result:
Empty DataFrame
Columns: [PlatformCategory, Platform, ResClassName, Amount]
Index: []
Then group:
df = df.groupby(["PlatformCategory","Platform","ResClassName"]).sum()
df = df.reset_index(drop=False,inplace=True)
print df
Result: sometimes None, sometimes an empty DataFrame
Empty DataFrame
Columns: []
Index: []
Why does the empty DataFrame have no columns?
Runtime warnings:
/data/pyrun/lib/python2.7/site-packages/pandas/core/groupby.py:3672: RuntimeWarning: divide by zero encountered in log
if alpha + beta * ngroups < count * np.log(count):
/data/pyrun/lib/python2.7/site-packages/pandas/core/groupby.py:3672: RuntimeWarning: invalid value encountered in double_scalars
if alpha + beta * ngroups < count * np.log(count):
You need as_index=False and group_keys=False:
df = df.groupby(["PlatformCategory","Platform","ResClassName"], as_index=False).count()
df
Empty DataFrame
Columns: [PlatformCategory, Platform, ResClassName, Amount]
Index: []
There is no need to reset the index afterwards.
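
The two points above can be checked with a minimal sketch. It assumes a reasonably recent pandas and Python 3. The names empty, grouped, frame and returned are only illustrative, and the small non-empty frame is made up for the demonstration. Output on empty frames has varied between pandas versions, so no expected output is shown. The second half shows a likely reason the question sometimes saw None: reset_index(inplace=True) returns None, so assigning its result back to df throws the frame away.

```python
import pandas as pd

cols = ["PlatformCategory", "Platform", "ResClassName", "Amount"]
keys = cols[:3]

# An empty frame with the same columns as in the question.
empty = pd.DataFrame(columns=cols)

# as_index=False keeps the grouping keys as ordinary columns,
# so no reset_index() call is needed afterwards.
grouped = empty.groupby(keys, as_index=False).count()
print(list(grouped.columns))
print(len(grouped))

# reset_index(inplace=True) modifies the frame in place and returns None,
# so binding its return value to a name leaves that name pointing at None.
frame = pd.DataFrame({"key": ["a", "b"], "Amount": [1, 2]}).set_index("key")
returned = frame.reset_index(inplace=True)
print(returned is None)
print(list(frame.columns))
```

If a reset is still wanted, either call df.reset_index(inplace=True) without assigning the result, or write df = df.reset_index() without inplace.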