Python: DataFrame apply does not accept the axis argument

sle*_*ile 6 python apply dataframe pandas

I have two dataframes: data and rules.

>>>data                            >>>rules
   vendor                             rule
0  googel                           0 google
1  google                           1 dell
2  googly                           2 macbook

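After computing the Levenshtein similarity between each vendor and each rule, I am trying to add two new columns to the data dataframe. Ideally, my dataframe should end up with columns like this: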

>>>data
  vendor   rule    similarity
0 googel   google    0.8

So far I have been trying to run an apply function that returns this structure, but the dataframe apply does not accept the axis argument.

>>> for index,r in rules.iterrows():
...     data[['rule','similarity']]=data['vendor'].apply(lambda row:[r[0],ratio(row[0],r[0])],axis=1)
...
Traceback (most recent call last):

File "<stdin>", line 2, in <module>

File "/home/mnnr/test/env/test-1.0/runtime/lib/python3.4/site-packages/pandas/core/series.py", line 2220, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/src/inference.pyx", line 1088, in pandas.lib.map_infer (pandas/lib.c:62658)
File "/home/mnnr/test/env/test-1.0/runtime/lib/python3.4/site-packages/pandas/core/series.py", line 2209, in <lambda>
f = lambda x: func(x, *args, **kwds)

TypeError: <lambda>() got an unexpected keyword argument 'axis'

Can someone help me figure out what I am doing wrong? Every change I make just creates new errors. Thanks.

EdC*_*ica 6

You are calling the Series version of apply, for which an axis arg makes no sense, hence the error.

If you do this instead:

data[['rule','similarity']]=data[['vendor']].apply(lambda row:[r[0],ratio(row[0],r[0])],axis=1)

then data[['vendor']] is a single-column df, and for a df the axis arg is valid.

Or just remove the axis arg:

data[['rule','similarity']]=data['vendor'].apply(lambda row:[r[0],ratio(row,r[0])])
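To make the difference concrete, here is a minimal sketch of the two apply variants on the question's vendor column. The `.upper()` call is only a placeholder for the real similarity computation, and the variable names are made up for illustration. As far as I know, Series.apply forwards unknown keyword arguments to the function, which is why the lambda is the one complaining about axis.

```python
import pandas as pd

data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})

# Series.apply has no axis parameter; extra keyword arguments are
# forwarded to the function, so the lambda receives axis=1 and fails.
try:
    data["vendor"].apply(lambda v: v.upper(), axis=1)
    series_error = None
except TypeError as exc:
    series_error = str(exc)

# DataFrame.apply does take axis; with axis=1 each row arrives as a Series,
# so the value is looked up by column label.
upper_from_frame = data[["vendor"]].apply(lambda row: row["vendor"].upper(), axis=1)

# Without axis, Series.apply passes each value (here a plain string) directly.
upper_from_series = data["vendor"].apply(lambda v: v.upper())
```

Note that in the Series case the lambda argument is the string itself, so indexing it with [0] would give only its first character.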

Update

Looking at what you are doing, you need to compute the Levenshtein ratio of each rule against each vendor.

You can do this with:

data['vendor'].apply(lambda row: rules['rule'].apply(lambda x: ratio(x, row)))

which I think should compute the ratio for each vendor against each rule.
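One possible way to get from those per-rule ratios to the vendor / rule / similarity layout in the question is sketched below. It rests on a few assumptions: the question's ratio is replaced by a stand-in built on the standard library's difflib (so the scores will not match a true Levenshtein ratio), the score table is built with a plain nested comprehension rather than nested apply, and "the" rule for a vendor is taken to be the best-scoring one.

```python
from difflib import SequenceMatcher

import pandas as pd


def ratio(a, b):
    # Stand-in for the question's ratio(); difflib's similarity is not
    # the Levenshtein ratio, it is used here only to keep this runnable.
    return SequenceMatcher(None, a, b).ratio()


data = pd.DataFrame({"vendor": ["googel", "google", "googly"]})
rules = pd.DataFrame({"rule": ["google", "dell", "macbook"]})

# Score table: one row per vendor, one column per rule.
scores = pd.DataFrame(
    [[ratio(v, r) for r in rules["rule"]] for v in data["vendor"]],
    index=data.index,
    columns=rules["rule"],
)

# Assumption: keep the best-scoring rule and its score for each vendor.
data["rule"] = scores.idxmax(axis=1)
data["similarity"] = scores.max(axis=1)
```

Swapping the stand-in for the real ratio function should only change the numbers in the similarity column, not the structure.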