Python pandas: column with dtype = object causes merge to fail; DtypeWarning: Columns have mixed types

jea*_*elj 5 python merge type-conversion pandas

I am trying to merge two dataframes, df1 and df2, on the Customer_ID column. Customer_ID appears to have the same data type (object) in both.

df1:

Customer_ID |  Flag
12345           A

df2:

Customer_ID | Transaction_Value
12345           258478

When I merge the two tables:

new_df = df2.merge(df1, on='Customer_ID', how='left')

For some Customer_ID values the merge works, and for others it does not. For this example, I get the following result:

Customer_ID | Transaction_Value | Flag
    12345           258478         NaN

I checked the data types, and they are the same:

df1.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 873353 entries, 0 to 873352
Data columns (total 2 columns):
Customer_ID    873353 non-null object
Flag      873353 non-null object
dtypes: object(2)
memory usage: 20.0+ MB

df2.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 873353 entries, 0 to 873352
Data columns (total 2 columns):
Customer_ID    873353 non-null object
Transaction_Value      873353 int64
dtypes: object(2)
memory usage: 20.0+ MB

When I loaded df1, I did get the following message:

C:\Users\xxx\AppData\Local\Continuum\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py:2717: DtypeWarning: Columns (1) have mixed types. Specify dtype option on import or set low_memory=False.
  interactivity=interactivity, compiler=compiler, result=result)

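When I wanted to check whether a customer ID exists, I realized that I have to specify it differently in the two dataframes: as an integer in df1 and as a string in df2.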

df1.loc[df1['Customer_ID'] == 12345]

df2.loc[df2['Customer_ID'] == '12345']

piR*_*red 6

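Customer_ID is dtype == object in both cases... However, that does not mean the individual elements are all the same type. You need to make both columns str or both int before merging.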
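To see the point about element types concretely, here is a minimal sketch. The two tiny frames are hypothetical stand-ins for the asker's data, built on the assumption that df1 stores the ID as an int and df2 stores it as a str, which is what the two lookups in the question suggest.

```python
import pandas as pd

# Hypothetical miniature frames: the ID is an int in df1 and a str in df2,
# but both columns are stored with dtype object.
df1 = pd.DataFrame({
    "Customer_ID": pd.Series([12345], dtype=object),
    "Flag": ["A"],
})
df2 = pd.DataFrame({
    "Customer_ID": pd.Series(["12345"], dtype=object),
    "Transaction_Value": [258478],
})

# Both columns report the same dtype ...
print(df1["Customer_ID"].dtype, df2["Customer_ID"].dtype)

# ... but the Python type of the individual elements differs.
types1 = df1["Customer_ID"].map(type).value_counts()
types2 = df2["Customer_ID"].map(type).value_counts()
print(types1)
print(types2)

# The merge compares 12345 with "12345"; they are not equal, so no row matches.
merged = df2.merge(df1, on="Customer_ID", how="left")
print(merged)
```

Running `df["Customer_ID"].map(type).value_counts()` on the real frames should show whether a column mixes int and str values, which is what the DtypeWarning hints at.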


Using int

dtype = dict(Customer_ID=int)

df1.astype(dtype).merge(df2.astype(dtype), 'left')

   Customer_ID Flag  Transaction_Value
0        12345    A             258478

Using str

dtype = dict(Customer_ID=str)

df1.astype(dtype).merge(df2.astype(dtype), 'left')

   Customer_ID Flag  Transaction_Value
0        12345    A             258478
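The DtypeWarning in the question also suggests specifying the dtype on import, which may avoid the mixed column in the first place. A minimal sketch follows, assuming the data comes from CSV files read with pd.read_csv. The in-memory CSV contents below are hypothetical. str is chosen on the assumption that some IDs may not be purely numeric, in which case casting to int would raise an error.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the real files.
csv1 = io.StringIO("Customer_ID,Flag\n12345,A\n00AB7,B\n")
csv2 = io.StringIO("Customer_ID,Transaction_Value\n12345,258478\n")

# Reading the key column as str on import gives every ID the same type.
df1 = pd.read_csv(csv1, dtype={"Customer_ID": str})
df2 = pd.read_csv(csv2, dtype={"Customer_ID": str})

# Same merge as in the question; the keys are now comparable.
new_df = df2.merge(df1, on="Customer_ID", how="left")
print(new_df)
```

With both key columns read as str, the row for 12345 picks up its Flag instead of NaN.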