I am trying to sort a DataFrame by its Total column:
df.sort_values(by='Total', ascending=False, axis=0, inplace =True)
But I get the following warning:
/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
"""Entry point for launching an IPython kernel.
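When I follow the link, it suggests using the .loc method. After that I looked up .sort_values() and found that it can be used with inplace=False or None.
My question is: if I have an unsorted DataFrame and do not use inplace=True, will my DataFrame be sorted for further use, or do I have to assign the result to a new name and save it?
The warning is not very clear, but if you combine .copy() with .loc when creating a df by filtering another df, the warning should go away.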
import pandas as pd

df = pd.DataFrame({'num': range(10), 'Total': range(20, 30)})

# .loc without .copy(): sorting in place leads to SettingWithCopyWarning
df_2 = df.loc[df.num < 5]
df_2.sort_values(by='Total', ascending=False, axis=0, inplace=True)

# .loc with .copy(): no warning
df_3 = df.loc[df.num < 5].copy()
df_3.sort_values(by='Total', ascending=False, axis=0, inplace=True)
You can find more details here, but there is a class of very annoying pandas bugs that SettingWithCopyWarning is trying to protect you from.
df_4 = df.copy()
df_4['new_col'] = df_4.num * 2
df_5 = df
df_5['new_col_2'] = df_5.num * 3
# df_5 is only another name for df, so new_col_2 is also added to df;
# new_col is not, because df_4 was created with .copy()
df.columns
# Index(['num', 'Total', 'new_col_2'], dtype='object')
df[df.num < 2].loc[:, ['Total']] = 100
df.Total.max()
# still 29: the chained indexing assigns to a temporary copy, so Total in df was not updated
df.loc[df.num < 2, 'Total'] = 100
df.Total.max()
# 100
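As for the question about inplace itself: without inplace=True, sort_values() returns a new, sorted DataFrame and leaves the original untouched, so the result has to be assigned to something (a new name, or the same name again). Here is a minimal sketch of that, reusing the example data from above; the names subset, result, unchanged_first and sorted_first are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'num': range(10), 'Total': range(20, 30)})

# Take an explicit copy of the filtered rows so later changes are unambiguous.
subset = df.loc[df.num < 5].copy()

# Without inplace=True, sort_values returns a new sorted DataFrame
# and does not change the object it was called on.
result = subset.sort_values(by='Total', ascending=False)
unchanged_first = subset['Total'].iloc[0]  # first row of the unsorted original
sorted_first = result['Total'].iloc[0]     # first row of the sorted result

# Assigning back to the same name is the usual alternative to inplace=True.
subset = subset.sort_values(by='Total', ascending=False)
```

So you do not need a new name, but you do need an assignment; calling sort_values() on its own line without inplace=True and without assigning the result has no lasting effect.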