I am trying to create a new column in the netc DataFrame, but I get a warning:
netc["DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM
C:\Anaconda\lib\site-packages\ipykernel\__main__.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
What is the correct way to create a field in newer versions of pandas so that the warning is not raised?
pd.__version__
Out[45]:
u'0.19.2+0.g825876c.dirty'
Fil*_*rda 18
As the message says, try using .loc[row_indexer,col_indexer] to create the new column:
netc.loc[:,"DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM
According to the Pandas Indexing Docs, though, your original code should work.
netc["DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM
gets translated to
netc.__setitem__('DeltaAMPP', netc.LOAD_AM - netc.VPP12_AM)
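which should have predictable behaviour. The SettingWithCopyWarning is only there to warn users of unexpected behaviour during chained assignment (which is not what you are doing). However, as the docs mention:
"Sometimes a SettingWithCopy warning will arise at times when there's no obvious chained indexing going on. These are the bugs that SettingWithCopy is designed to catch! Pandas is probably trying to warn you that you've done this:"
The docs then go on to give an example of when the warning can appear even though it is not expected. So I can't tell why this is happening without more context.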
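To make the "no obvious chained indexing" case concrete, here is a minimal sketch. It assumes netc was produced by filtering a parent frame, which the question does not confirm; netb, its values, and the helper assign_and_collect are made up for illustration. Whether a warning is actually recorded depends on the pandas version (releases with Copy-on-Write may not emit it at all), so no output is claimed for that part.

```python
import warnings

import pandas as pd


def assign_and_collect(frame, column, values):
    """Assign a column and return the names of any warnings raised.

    Hypothetical helper for illustration; not part of pandas.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        frame[column] = values
    return [w.category.__name__ for w in caught]


# Made-up data standing in for the unknown parent of `netc`.
netb = pd.DataFrame(
    {"LOAD_AM": [10.0, 20.0, 30.0], "VPP12_AM": [1.0, 2.0, 3.0]}
)

# The chained indexing is hidden here: `netc` is itself the result of indexing.
netc = netb[netb["LOAD_AM"] > 15]

raised = assign_and_collect(netc, "DeltaAMPP", netc.LOAD_AM - netc.VPP12_AM)
print("warnings raised:", raised)  # version dependent

# Boolean-mask selection returns new data, so the parent is not modified.
print("DeltaAMPP" in netb.columns)  # False
```

The assignment itself succeeds on netc either way; the warning only signals that pandas cannot tell whether you meant to modify the parent as well.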
Mar*_*hke 18
I ran into the SettingWithCopyWarning when assigning data to a DataFrame df that had been constructed by indexing.
Both of these commands
df['new_column'] = something
df.loc[:, 'new_column'] = something
raised the warning. Once I copied df (with DataFrame.copy()), everything was fine.
In the code below, compare df0 = df_test[df_test['a']>3] with df1 = df_test[df_test['a']>3].copy(). For df0, both assignments raise the warning. For df1, both work fine.
>>> df_test
a b c d e
0 0.0 1.0 2.0 3.0 0
1 4.0 5.0 6.0 7.0 1
2 8.0 9.0 10.0 11.0 2
3 12.0 13.0 14.0 15.0 3
4 16.0 17.0 18.0 19.0 4
>>> df0 = df_test[df_test['a']>3]
>>> df1 = df_test[df_test['a']>3].copy()
>>> df0['e'] = np.arange(4)
__main__:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
>>> df1['e'] = np.arange(4)
>>> df0.loc[2, 'a'] = 77
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py:1719: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_column(loc, value, pi)
>>> df1.loc[2, 'a'] = 77
>>> df0
a b c d e
1 4.0 5.0 6.0 7.0 0
2 77.0 9.0 10.0 11.0 1
3 12.0 13.0 14.0 15.0 2
4 16.0 17.0 18.0 19.0 3
>>> df1
a b c d e
1 4.0 5.0 6.0 7.0 0
2 77.0 9.0 10.0 11.0 1
3 12.0 13.0 14.0 15.0 2
4 16.0 17.0 18.0 19.0 3
By the way: reading the documentation on this issue (the link in the warning) is recommended.
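The session above can also be written as a small script. This is only a sketch of the .copy() route: df_test is rebuilt here with values that mirror the printed frame, and the final check simply confirms that the original frame is not affected by changes to the copy.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like df_test above (values are illustrative).
df_test = pd.DataFrame(np.arange(20.0).reshape(5, 4), columns=list("abcd"))
df_test["e"] = np.arange(5)

# .copy() makes df1 an independent DataFrame, so pandas has no ambiguity
# about whether an assignment should reach df_test.
df1 = df_test[df_test["a"] > 3].copy()

df1["e"] = np.arange(4)  # replace a whole column
df1.loc[2, "a"] = 77     # set a single cell by label

print(df1)
print(df_test.loc[2, "a"])  # 8.0, the original frame keeps its value
```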
Your example is incomplete, as it doesn't show where netc comes from. It is likely that netc itself is the product of slicing, and as such pandas cannot guarantee whether it is a view or a copy.
For example, if you're doing this:
netc = netb[netb["DeltaAMPP"] == 0]
netc["DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM
then Pandas wouldn't know if netc is a view or a copy. If it were a one-liner, it would effectively be like this:
netb[netb["DeltaAMPP"] == 0]["DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM
where you can see the double indexing more clearly.
If you want to make netc separate from netb, one possible remedy is to force a copy in the first line (the loc is there to make sure we're not copying twice), like:
netc = netb.loc[netb["DeltaAMPP"] == 0].copy()
If, on the other hand, you want to have netb modified with the new column, you may do:
netb.loc[netb["DeltaAMPP"] == 0, "DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM
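As a rough side-by-side of the two remedies, here is a sketch with made-up values; only the column names come from the question. The right-hand side of the second option uses netb's own columns rather than netc, which is an assumption made to keep the example self-contained.

```python
import pandas as pd

# Made-up values; only the column names come from the discussion above.
netb = pd.DataFrame(
    {
        "LOAD_AM": [10.0, 20.0, 30.0],
        "VPP12_AM": [1.0, 2.0, 3.0],
        "DeltaAMPP": [0.0, 5.0, 0.0],
    }
)
mask = netb["DeltaAMPP"] == 0

# Option 1: an independent subset. Changes to netc never reach netb.
netc = netb.loc[mask].copy()
netc["DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM
untouched = netb["DeltaAMPP"].tolist()  # still [0.0, 5.0, 0.0]

# Option 2: write into netb itself with a single .loc call.
# The right-hand side is aligned to the selected rows by index label.
netb.loc[mask, "DeltaAMPP"] = netb.LOAD_AM - netb.VPP12_AM

print(netc)
print(netb)
```

Which option fits depends on whether the filtered rows are meant to be a working copy or an update to the original table.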
小智 5
You need to reset the index when creating the column, especially if you have filtered on specific values. Then you don't need to use .loc[row_indexer,col_indexer]:
netc.reset_index(drop=True, inplace=True)
netc["DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM
Then it should work :)