Asserting the equality of two pandas DataFrames?

Meh*_*omi 18 python pandas

How can I assert that the following two DataFrames, df1 and df2, are equal?

import pandas as pd
df1 = pd.DataFrame([1, 2, 3])
df2 = pd.DataFrame([1.0, 2, 3])

df1.equals(df2) returns False. So far I know of two ways:

print((df1 == df2).all()[0])

or

df1 = df1.astype(float)
print(df1.equals(df2))

Both feel a bit messy. Is there a better way to do this comparison?

Ale*_*der 26

You can use assert_frame_equal and tell it not to check the dtypes of the columns.

# Pre v. 0.20.3
# from pandas.util.testing import assert_frame_equal

from pandas.testing import assert_frame_equal

assert_frame_equal(df1, df2, check_dtype=False)
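Note that assert_frame_equal raises an AssertionError on a mismatch rather than returning a boolean. If you want a True/False result, a minimal sketch is to wrap the call; `frames_equal` below is a hypothetical helper name, not part of pandas:

```python
import pandas as pd
from pandas.testing import assert_frame_equal

df1 = pd.DataFrame([1, 2, 3])
df2 = pd.DataFrame([1.0, 2, 3])


def frames_equal(left, right, **kwargs):
    """Hypothetical helper: True if assert_frame_equal passes, False otherwise."""
    try:
        # kwargs are forwarded unchanged, e.g. check_dtype=False
        assert_frame_equal(left, right, **kwargs)
    except AssertionError:
        return False
    return True


# Default settings also compare dtypes (int64 vs float64 here).
strict = frames_equal(df1, df2)
# With check_dtype=False only the values, index and columns are compared.
relaxed = frames_equal(df1, df2, check_dtype=False)
print(strict, relaxed)
```

In a test suite you would normally call assert_frame_equal directly, since its error message points at the column that differs.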

  • As of pandas 0.20.3, `assert_frame_equal` lives in the `pandas.testing` package: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.testing.assert_frame_equal.html (7 upvotes)

Max*_*axU 6

Using @Divakar's elegant idea, NumPy's allclose() does the main work for the numeric columns:

In [128]: df1
Out[128]:
   0    s  n
0  1  aaa  1
1  2  aaa  2
2  3  aaa  3

In [129]: df2
Out[129]:
     0    s    n
0  1.0  aaa  1.0
1  2.0  aaa  2.0
2  3.0  aaa  3.0

In [130]: (np.allclose(df1.select_dtypes(exclude=[object]), df2.select_dtypes(exclude=[object]))
   .....:  &
   .....:  df1.select_dtypes(include=[object]).equals(df2.select_dtypes(include=[object]))
   .....: )
Out[130]: True

select_dtypes() helps you separate the string columns from all the numeric ones.
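For reuse, the same idea can be packaged as a function. This is only a sketch: `loosely_equal` is a made-up name, and it splits the columns on np.number instead of object (my substitution, so that non-object string dtypes still land in the non-numeric half). It also checks the numeric column labels first, because allclose() only sees the raw values.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({0: [1, 2, 3], "s": ["aaa"] * 3, "n": [1, 2, 3]})
df2 = pd.DataFrame({0: [1.0, 2.0, 3.0], "s": ["aaa"] * 3, "n": [1.0, 2.0, 3.0]})


def loosely_equal(left, right):
    """Hypothetical helper: numeric columns via allclose, the rest via equals."""
    num_l = left.select_dtypes(include=[np.number])
    num_r = right.select_dtypes(include=[np.number])
    other_l = left.select_dtypes(exclude=[np.number])
    other_r = right.select_dtypes(exclude=[np.number])

    # allclose ignores labels, so compare shape and column names explicitly.
    if num_l.shape != num_r.shape or list(num_l.columns) != list(num_r.columns):
        return False

    numeric_ok = bool(np.allclose(num_l.to_numpy(), num_r.to_numpy()))
    # equals() also compares the index and column labels of the non-numeric part.
    return numeric_ok and other_l.equals(other_r)


print(loosely_equal(df1, df2))
```

One caveat: allclose() treats NaN as unequal to NaN by default; pass equal_nan=True if missing values in the same positions should count as equal.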