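I have this code, which manipulates a dataset by extracting information from an existing column to create a new one. To match the data correctly against another dataset with pd.merge, I want to convert the "Channel ID" column to integers. Although I currently use .astype(int), the resulting dtype shows as float64 when I inspect the frame with .info().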
import re
import pandas as pd

def cost(received_frame):
    received_frame.columns = ['Campaign', 'Ad Spend']
    campaigns = received_frame['Campaign']
    ID = []
    for c in campaigns:
        # Campaign names are underscore-separated; skip the first block
        blocks = re.split('_', c)
        for block in blocks[1:]:
            # Keep only blocks that look like a 6-digit channel ID
            if len(block) == 6 and block.isdigit():
                ID.append(block)
    ID = pd.Series(ID).str.replace("'", "")
    ID = pd.DataFrame(ID)
    both = [ID, received_frame]
    frame = pd.concat(both, axis=1)
    frame.columns = ['Channel ID', 'Campaign', 'Ad Spend']
    # Intended to make the column integer, but .info() reports float64
    frame['Channel ID'] = frame['Channel ID'].dropna().astype(int)
    return frame
When you write
frame['Channel ID'].dropna().astype(int)
dropping the NA values returns a Series that may have fewer index entries than the frame.
Then, when you assign it back with
frame['Channel ID'] = frame['Channel ID'].dropna().astype(int)
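pandas aligns the new Series with the frame by index. Any row whose index was dropped has no matching value and is filled with NaN, and since NaN is a float, the whole column has to be converted to float64 as well.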
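The alignment can be reproduced on a small frame. This is only a minimal sketch: the three sample IDs are made-up data, not values from the question.

```python
import pandas as pd

# Hypothetical column: 6-digit IDs as strings, with one missing value.
frame = pd.DataFrame({"Channel ID": ["123456", None, "654321"]})

# dropna() removes row 1, so the converted Series only has index labels 0 and 2.
converted = frame["Channel ID"].dropna().astype(int)
print(converted.dtype)            # an integer dtype
print(list(converted.index))

# Assignment aligns on the index: row 1 has no match and becomes NaN,
# which forces the whole column to float64.
frame["Channel ID"] = converted
print(frame["Channel ID"].dtype)
```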
You should replace this with something else, depending on your problem (fillna?).
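A few possible replacements, sketched on made-up data. Which one fits depends on what a missing channel ID should mean for the later pd.merge. The sentinel value -1 and the variable names here are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data standing in for the frame built in cost().
frame = pd.DataFrame({
    "Channel ID": ["123456", None, "654321"],
    "Ad Spend": [10.0, 20.0, 30.0],
})

# Option 1: drop the incomplete rows from the whole frame, then convert.
dropped = frame.dropna(subset=["Channel ID"]).copy()
dropped["Channel ID"] = dropped["Channel ID"].astype(int)

# Option 2: keep every row and fill the gaps with a sentinel
# (-1 is an arbitrary choice that should not collide with a real ID).
filled = frame.copy()
filled["Channel ID"] = pd.to_numeric(filled["Channel ID"]).fillna(-1).astype(int)

# Option 3: keep missing values as <NA> using the nullable integer dtype.
nullable = frame.copy()
nullable["Channel ID"] = pd.to_numeric(nullable["Channel ID"]).astype("Int64")

print(dropped.dtypes, filled.dtypes, nullable.dtypes, sep="\n\n")
```

Whichever option is used, the key column in the other dataset should probably have the same dtype before merging.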