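Suppose I have a df with columns 'ID', 'col_1', 'col_2', and I define a function:
f = lambda x, y : my_function_expression.
Now I want to apply f to df's two columns 'col_1' and 'col_2' to compute a new column 'col_3' element-wise, something like: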
df['col_3'] = df[['col_1','col_2']].apply(f)
# Pandas gives : TypeError: ('<lambda>() takes exactly 2 arguments (1 given)'
How can I do this?
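
The error most likely arises because DataFrame.apply defaults to axis=0 and passes each column to the function as a single Series argument, so a two-argument function receives only one argument. A minimal sketch of one common workaround is below: pass axis=1 and unpack the row inside a wrapper lambda. The function f here (simple addition) is a hypothetical stand-in for my_function_expression, and the sample values are assumed for illustration.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2', '3'],
                   'col_1': [0, 2, 3],
                   'col_2': [1, 4, 5]})

# Hypothetical two-argument function standing in for my_function_expression.
f = lambda x, y: x + y

# axis=1 hands each row to the wrapper as a Series; the wrapper
# unpacks the two columns and forwards them to f as separate arguments.
df['col_3'] = df.apply(lambda row: f(row['col_1'], row['col_2']), axis=1)

# Alternative without apply: iterate over the two columns in parallel.
col_3_alt = [f(x, y) for x, y in zip(df['col_1'], df['col_2'])]

print(df)
```

The zip-based variant avoids building a Series for every row, which tends to be faster on large frames, but both should give the same values.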
** Detailed example added below **
import pandas as pd
df = pd.DataFrame({'ID':['1','2','3'], 'col_1': [0,2,3], 'col_2':[1,4,5]})
mylist = ['a','b','c','d','e','f']
def get_sublist(sta,end):
    return mylist[sta:end+1]
#df['col_3'] = df[['col_1','col_2']].apply(get_sublist,axis=1)
# expect above to output df as below
ID col_1 col_2 col_3
0 1 0 …

I have a pandas DataFrame and want to select the rows where the value in one column starts with the value in another column. I have tried the following:
import pandas as pd
df = pd.DataFrame({'A': ['apple', 'xyz', 'aa'],
                   'B': ['app', 'b', 'aa']})
df_subset = df[df['A'].str.startswith(df['B'])]
But it raises an error, and the solution I found did not help either.
KeyError: "None of [Float64Index([nan, nan, nan], dtype='float64')] are in the [columns]"
The np.where(df['A'].str.startswith(df['B']), True, False) approach that I found also returns True for every row.
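
Series.str.startswith is documented as taking a single string (or a tuple of strings) as its pattern, so passing another Series does not compare the columns row by row. A minimal sketch of one way to get a row-wise comparison, assuming both columns contain only strings (no missing values), is to pair the columns up with zip and call Python's own str.startswith on each pair:

```python
import pandas as pd

df = pd.DataFrame({'A': ['apple', 'xyz', 'aa'],
                   'B': ['app', 'b', 'aa']})

# Pair each value in A with the value in B from the same row and
# test the prefix with the built-in str.startswith.
mask = [a.startswith(b) for a, b in zip(df['A'], df['B'])]

# A list of booleans with one entry per row works as a row filter.
df_subset = df[mask]

print(df_subset)
```

With this sample data the rows at index 0 ('apple' / 'app') and 2 ('aa' / 'aa') are kept. If the columns can contain NaN, the comprehension would need an explicit check before calling startswith.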