How to combine two columns with an if/else in Python pandas?

poc*_*ese 8 python pandas

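I am very new to pandas (i.e., less than 2 days). However, I can't seem to figure out the right syntax for combining two columns with an if/else condition.

Actually, I did find a way to do it using 'zip'. This is what I want to accomplish, but it seems there might be a more efficient way to do it in pandas.

For completeness' sake, I include some preprocessing to make things clear: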

import pandas as pd

records_data = pd.read_csv('records.csv')

## pull out a year from column using a regex
source_years = records_data['source'].map(extract_year_from_source)

## this is what I want to do more efficiently (if it's possible)
records_data['year'] = [s if s else y for (s, y) in zip(source_years, records_data['year'])]

Jef*_*eff 11

In pandas >= 0.10.0, try:

df['year'] = source_years.where(source_years != 0, df['year'])

And see:

http://pandas.pydata.org/pandas-docs/stable/indexing.html#the-where-method-and-masking

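As pointed out in the comments, this DOES use np.where under the hood; the difference is that pandas aligns the series with the output (so, for example, you can do only a partial update).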
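To illustrate the alignment point, here is a minimal sketch with made-up data. The sample values, and the convention that 0 means "no year found in the source", are assumptions for the example and not taken from the question's real data. It shows that Series.where matches the fallback values by index label, whereas np.where pairs values purely by position.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 0 stands for "no year found in the source".
df = pd.DataFrame({"year": [1990, 1985, 2001, 1977]})
source_years = pd.Series([2005, 0, 1999, 0])

# Keep source_years where it is non-zero, otherwise fall back to df["year"].
combined = source_years.where(source_years != 0, df["year"])

# Same fallback values, but with the rows in reversed order (labels 3, 2, 1, 0).
shuffled = df["year"].iloc[::-1]

# Series.where aligns `shuffled` by index label, so the result is unchanged.
aligned = source_years.where(source_years != 0, shuffled)

# np.where ignores the index and pairs values by position instead.
positional = np.where(source_years != 0, source_years, shuffled)

print(combined.tolist())
print(aligned.tolist())
print(positional.tolist())
```

If both series share the same index in the same order, as in the question, the two approaches should give the same values.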


unu*_*tbu 8

Maybe try np.where:

import numpy as np
df['year'] = np.where(source_years,source_years,df['year'])
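One caveat, sketched below under an assumption: if extract_year_from_source marks a missing year with NaN (or None) rather than 0, relying on truthiness may not do what you want, because NaN is truthy in Python. In that case an explicit notna() mask is probably safer. The data here is invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; NaN stands for "no year found in the source".
df = pd.DataFrame({"year": [1990, 1985, 2001, 1977]})
source_years = pd.Series([2005.0, np.nan, 1999.0, np.nan])

# Use an explicit boolean mask instead of relying on truthiness.
has_year = source_years.notna()

# Take source_years where present, otherwise keep the existing year.
df["year"] = np.where(has_year, source_years, df["year"])

print(df["year"].tolist())
```

Note that the resulting column is float here because source_years contains NaN; cast with astype(int) afterwards if integer years are needed.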