ValueError: You are trying to merge on object and int64 columns when using pandas merge

11 python pandas

The data in test.csv is as follows:

device_id,upload_time,latitude,longitude,mileage,other_vals,speed,upload_time_1
11115304371,2020-08-05 05:10:05+00:00,23.140366,114.18685,0,,0,202008
1234,2020-08-05 05:10:33+00:00,22.994716,114.2998,0,,0,202008
11115304371,2020-08-05 05:20:55+00:00,22.994716,114.2998,0,,3.8,202008
11115304371,2020-08-05 05:24:02+00:00,22.994916,114.299683,0,,2.1,202008
11115304371,2020-08-05 05:24:30+00:00,22.99545,114.2998,0,,6.5,202008
11115304371,2020-08-05 05:29:30+00:00,22.995433,114.299766,0,,3.4,202008
11115304371,2020-08-05 05:34:30+00:00,22.995433,114.299766,0,,3.4,202008
11115304371,2020-08-05 05:39:30+00:00,22.995433,114.299766,0,,3.4,202008
822649e2d142a486,2020-08-05 05:44:30+00:00,22.995433,114.299766,0,,3.4,202008
11115304371,2020-08-05 05:44:53+00:00,22.995433,114.299766,0,,3.4,202008
11115304371,2020-08-05 05:45:40+00:00,22.995433,114.299766,0,,5.8,202008

The data in info.csv is as follows:

car_id,device_id,car_type,car_num,marketer_name
1,11110110037,1,AAA,T1
2,11115304371,1,BBB,T2
3,11111100345,1,CCC,T3
4,11111100242,1,DDD,T4
5,12221100034,1,EEE,T5
6,12221100230,1,FFF,T6
7,14465301234,1,GGG,T7

When I merge the two DataFrames with this code:

import pandas as pd

df_device_data = pd.read_csv(r'E:/test.csv', encoding='utf-8', parse_dates=[1], low_memory=False)
df_common_car_info = pd.read_csv(r'E:/info.csv', encoding='utf-8', low_memory=False)
result = pd.merge(df_device_data, df_common_car_info, how='left', on='device_id')
result.to_csv(r'E:/result.csv', index=False, mode='w', header=True)

I get this error:

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

How can I fix it?

Piy*_*bhi 12

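Solution:
Just add the lines below to your code and it will work like magic :) They cast device_id to str in both DataFrames, so the merge keys have the same dtype.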

df_device_data['device_id'] = df_device_data['device_id'].astype(str)
df_common_car_info['device_id'] = df_common_car_info['device_id'].astype(str)
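For context, here is a minimal sketch of why the error appears, assuming a pandas version recent enough to raise it. test.csv contains a non-numeric ID (822649e2d142a486), so pandas reads that device_id column as strings, while info.csv contains only digits and is read as int64. The inline CSV snippets and the names left and right are shortened stand-ins for the files above, not part of the original code.

```python
import io

import pandas as pd

# Shortened stand-ins for test.csv and info.csv (assumed sample data).
test_csv = io.StringIO(
    "device_id,speed\n"
    "11115304371,0\n"
    "822649e2d142a486,3.4\n"
)
info_csv = io.StringIO(
    "car_id,device_id,car_num\n"
    "2,11115304371,BBB\n"
)

left = pd.read_csv(test_csv)
right = pd.read_csv(info_csv)

# The hex-like ID forces a string column on the left; the right is all digits.
print(left["device_id"].dtype)
print(right["device_id"].dtype)

# Merging string keys with integer keys is rejected.
error = None
try:
    pd.merge(left, right, how="left", on="device_id")
except ValueError as exc:
    error = exc
print(error)

# Casting both keys to str makes the dtypes agree, so the merge succeeds.
left["device_id"] = left["device_id"].astype(str)
right["device_id"] = right["device_id"].astype(str)
merged = pd.merge(left, right, how="left", on="device_id")
print(merged)
```

Printing the dtypes of both key columns before merging is a quick way to confirm that this mismatch is the cause.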

Final code:

import pandas as pd

df_device_data = pd.read_csv(r'/home/piyushsambhi/Downloads/test.csv', encoding='utf-8', parse_dates=[1], low_memory=False)
df_common_car_info = pd.read_csv(r'/home/piyushsambhi/Downloads/info.csv', encoding='utf-8', low_memory=False)

df_device_data['device_id'] = df_device_data['device_id'].astype(str)  # not required for your data and problem statement, but included just in case; it is best to handle errors before they occur :)
df_common_car_info['device_id'] = df_common_car_info['device_id'].astype(str)

result = pd.merge(df_device_data, df_common_car_info, how='left', on='device_id')
result.to_csv(r'/home/piyushsambhi/Downloads/result.csv', index=False, mode='w', header=True)

Output:

device_id,upload_time,latitude,longitude,mileage,other_vals,speed,upload_time_1,car_id,car_type,car_num,marketer_name
11115304371,2020-08-05 05:10:05,23.140366,114.18685,0,,0.0,202008,2.0,1.0,BBB,T2
1234,2020-08-05 05:10:33,22.994716,114.2998,0,,0.0,202008,,,,
11115304371,2020-08-05 05:20:55,22.994716,114.2998,0,,3.8,202008,2.0,1.0,BBB,T2
11115304371,2020-08-05 05:24:02,22.994916,114.299683,0,,2.1,202008,2.0,1.0,BBB,T2
11115304371,2020-08-05 05:24:30,22.99545,114.2998,0,,6.5,202008,2.0,1.0,BBB,T2
11115304371,2020-08-05 05:29:30,22.995433,114.29976599999999,0,,3.4,202008,2.0,1.0,BBB,T2
11115304371,2020-08-05 05:34:30,22.995433,114.29976599999999,0,,3.4,202008,2.0,1.0,BBB,T2
11115304371,2020-08-05 05:39:30,22.995433,114.29976599999999,0,,3.4,202008,2.0,1.0,BBB,T2
822649e2d142a486,2020-08-05 05:44:30,22.995433,114.29976599999999,0,,3.4,202008,,,,
11115304371,2020-08-05 05:44:53,22.995433,114.29976599999999,0,,3.4,202008,2.0,1.0,BBB,T2
11115304371,2020-08-05 05:45:40,22.995433,114.29976599999999,0,,5.8,202008,2.0,1.0,BBB,T2
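As a possible alternative to casting after loading, the key column can be read as a string in the first place through the dtype parameter of pd.read_csv. This is only a sketch under assumptions: the inline CSV text stands in for the real files, and reading car_id as the nullable Int64 dtype is an optional extra (not something the answer above does) that avoids values such as 2.0 in rows without a match.

```python
import io

import pandas as pd

# Shortened stand-ins for test.csv and info.csv (assumed sample data).
test_csv = io.StringIO(
    "device_id,speed\n"
    "11115304371,0\n"
    "822649e2d142a486,3.4\n"
)
info_csv = io.StringIO(
    "car_id,device_id,car_num\n"
    "2,11115304371,BBB\n"
)

# Read the merge key as a string in both files so the dtypes match from the start.
left = pd.read_csv(test_csv, dtype={"device_id": str})
# Optional assumption: nullable Int64 keeps car_id an integer even when rows have no match.
right = pd.read_csv(info_csv, dtype={"device_id": str, "car_id": "Int64"})

merged = pd.merge(left, right, how="left", on="device_id")
print(merged)
print(merged.dtypes)
```

Reading the key as a string also keeps IDs exactly as written in the file (for example, leading zeros), which astype(str) cannot restore once the column has been parsed as integers.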


小智 6

It works when I use astype(str), as in this code:

import pandas as pd

df_device_data = pd.read_csv(r'E:/test.csv', encoding='utf-8', parse_dates=[1], low_memory=False) 
df_device_data['device_id'] = df_device_data['device_id'].astype(str)
df_common_car_info = pd.read_csv(r'E:/info.csv', encoding='utf-8', low_memory=False) 
df_common_car_info['device_id'] = df_common_car_info['device_id'].astype(str)
result = pd.merge(df_device_data, df_common_car_info, how='left', on='device_id')
result.to_csv(r'E:/result.csv', index=False, mode='w', header=True)

The result is correct.
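One way to check such a result, sketched here with assumed inline data in place of the real files, is the indicator parameter of pd.merge. It adds a _merge column that marks each row as matched (both) or unmatched (left_only), which makes it easy to see which device IDs found no entry in info.csv.

```python
import io

import pandas as pd

# Shortened stand-ins for test.csv and info.csv (assumed sample data).
test_csv = io.StringIO(
    "device_id,speed\n"
    "11115304371,0\n"
    "1234,0\n"
    "822649e2d142a486,3.4\n"
)
info_csv = io.StringIO(
    "car_id,device_id,car_num\n"
    "1,11110110037,AAA\n"
    "2,11115304371,BBB\n"
)

left = pd.read_csv(test_csv, dtype={"device_id": str})
right = pd.read_csv(info_csv, dtype={"device_id": str})

# indicator=True adds a "_merge" column describing where each row came from.
merged = pd.merge(left, right, how="left", on="device_id", indicator=True)

# Device IDs from the left frame that have no counterpart on the right.
unmatched = merged.loc[merged["_merge"] == "left_only", "device_id"].tolist()
print(merged["_merge"].value_counts())
print(unmatched)
```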