Pandas DataFrame: cannot convert column dtype from object to string for further operations

Shy*_*nti 5 python type-conversion pandas

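Here is my working code, which downloads an Excel file from a website. It takes about 40 seconds.

After running this code, you will notice that the Key1, Key2 and Key3 columns have object dtype. I cleaned the DataFrame so that Key1 and Key2 contain only alphanumeric values, yet pandas still treats them as object dtype. I need to concatenate Key1 and Key2 (as in MS Excel) to create a separate column named deviceid. I found that I cannot join the two columns because they are object dtype. How do I convert them to strings so that I can create my new column?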

import pandas as pd
import urllib.request
import time

start=time.time()
url="https://www.misoenergy.org/Library/Repository/Market%20Reports/20170816_da_bcsf.xls"
cnstsfxls = urllib.request.urlopen(url)
xlsf = pd.ExcelFile(cnstsfxls)
dfsf = xlsf.parse("Sheet1",skiprows=3)
dfsf.drop(dfsf.index[len(dfsf)-1],inplace=True)
dfsf.drop(dfsf[dfsf['Device Type'] == 'UN'].index, inplace=True)
dfsf.drop(dfsf[dfsf['Device Type'] == 'UNKNOWN'].index, inplace=True)
dfsf.drop(['Constraint Name','Contingency Name', 'Constraint Type','Flowgate Name'],axis=1, inplace=True)
end=time.time()
print("The entire process took - ", end-start, " seconds.")

ves*_*and 0

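I may be missing the point. But if what you want is to construct a column such that, for example, deviceid = RCH417 when Key1 = RCH and Key2 = 417, then dfsf['deviceid'] = dfsf['Key1'] + dfsf['Key2'] works fine even though both columns are object dtype.

Try this: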

# Check value types
dfsf.dtypes

# Add your desired column
dfsf['deviceid'] = dfsf['Key1']  + dfsf['Key2']

# Inspect columns of interest
keep = ['Key1', 'Key2', 'deviceid']
df_keys = dfsf[keep]
print(df_keys.dtypes)

print(df_keys.head())
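If the plain addition does fail, one possible cause is that a key column still holds non-string values (for example, integers read from the spreadsheet), since pandas has traditionally stored Python strings in object-dtype columns and object columns can mix types. The following is a minimal sketch under that assumption, not a diagnosis of the actual file: the small DataFrame and its values are hypothetical stand-ins for the downloaded data, and only the column names Key1, Key2 and deviceid come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded sheet; the real values are not shown in the question.
df = pd.DataFrame({
    "Key1": ["RCH", "ABC", "DEF"],
    "Key2": [417, "X9", 12],  # mixed int/str values, so this column is object dtype
})

# Inspect the dtypes before concatenating.
print(df.dtypes)

# Adding Key1 and Key2 directly can fail when a cell holds a non-string value.
# Casting every value to str first makes the concatenation safe.
df["deviceid"] = df["Key1"].astype(str) + df["Key2"].astype(str)

print(df)
```

Note that astype(str) also turns missing values into literal text, so it may be worth checking for missing keys before building deviceid.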