pandas: TypeError: unhashable type: 'list'

Luc*_*erz 7 python hash list duplicates pandas

I have the following df:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        [["John Muller"], "person", [8866155845]],
        [["Innovation Division"], "company", np.nan],
        [["Carol Sway"], "person", [8866155845]],
    ],
    columns=["name", "kind", "phone"],
)

# Out:
#                     name     kind         phone
# 0          [John Muller]   person  [8866155845]
# 1  [Innovation Division]  company           NaN
# 2           [Carol Sway]   person  [8866155845]

I want to find the duplicated phone numbers. But the objects in the df are lists, so using:

df.duplicated('phone') 

raises the error:

TypeError: unhashable type: 'list'
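For context, the error most likely comes from the fact that `duplicated` has to hash each value in order to compare it, and Python lists cannot be hashed. A minimal sketch in plain Python, independent of pandas (the variable names are only illustrative):

```python
# Lists are mutable, so Python refuses to hash them.
try:
    hash([8866155845])
    list_hashable = True
    message = ""
except TypeError as exc:
    list_hashable = False
    message = str(exc)

# A tuple with the same content is immutable and hashes without error.
tuple_hashable = isinstance(hash((8866155845,)), int)

print(list_hashable, tuple_hashable)
print(message)
```

This suggests that any fix has to replace the lists with something hashable before calling `duplicated`.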

YOL*_*OLO 4

You can also solve this with applymap, which is a very handy function:

# get duplicated row
df2 = df[df.applymap(lambda x: x[0] if isinstance(x, list) else x).duplicated('phone')]

print(df2)

           name    kind         phone
2  [Carol Sway]  person  [8866155845]
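Two caveats about the snippet above: `x[0]` compares only the first element of each list, and `applymap` is deprecated in pandas 2.1 and later in favor of `DataFrame.map`. As a rough alternative sketch (not part of the original answer), the lists in the `phone` column can be turned into tuples so the whole list is compared. The names `phone_key` and `dupes` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        [["John Muller"], "person", [8866155845]],
        [["Innovation Division"], "company", np.nan],
        [["Carol Sway"], "person", [8866155845]],
    ],
    columns=["name", "kind", "phone"],
)

# Turn each list into a tuple (hashable); leave NaN and other scalars alone.
phone_key = df["phone"].apply(lambda x: tuple(x) if isinstance(x, list) else x)

# keep=False flags every row sharing a phone value, not only the later repeats.
# notna() excludes rows with no phone number at all.
dupes = df[phone_key.duplicated(keep=False) & phone_key.notna()]

print(dupes)
```

With `keep=False` both rows sharing the number are returned; dropping that argument gives only the later repeat, as in the answer above.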