Remove Chinese characters in pandas

5 python string replace dataframe pandas

I am trying to remove all Chinese characters from a csv that contains both Latin and Chinese characters. The data looks like this:

    address                                                 lat
1   ?????, Zhangjiang, Pudong New District, 203718       31.204024
2   ??, 3057?, Jinke Road, Pudong, 201203, China          31.181804

I need it to look like this:

    address                                                 lat
1   , Zhangjiang, Pudong New District, 203718               31.204024
2   , 3057, Jinke Road, Pudong, 201203, China               31.181804

I tried df.replace(/[^\x00-\x7F]/g, "") and df.replace(/[\u{0080}-\u{FFFF}]/gu, ""), but got an error:

    df1.replace([^\x00-\x7F],"");
                 ^
SyntaxError: invalid syntax

Any help would be appreciated. Thanks!

Max*_*axU 6

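You were almost there. The /.../g form is a JavaScript regex literal, which Python cannot parse; pass the pattern as a raw string to .str.replace on the column instead: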

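# regex=True is required in pandas 2.0+, where str.replace defaults to literal matching
df['address'] = df['address'].str.replace(r'[^\x00-\x7F]+', '', regex=True)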

Result:

In [99]: df
Out[99]:
                                     address        lat
0  , Zhangjiang, Pudong New District, 203718  31.204024
1  , 3057, Jinke Road, Pudong, 201203, China  31.181804
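
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The Chinese text in the sample rows is invented for illustration, since the original characters were lost in the question. The second pattern is an assumption on my part rather than part of the answer above: it targets only the CJK Unified Ideographs block (U+4E00 to U+9FFF) instead of everything outside ASCII.

```python
import pandas as pd

# Hypothetical sample rows; the Chinese text is made up for illustration.
df = pd.DataFrame({
    "address": [
        "张江高科技, Zhangjiang, Pudong New District, 203718",
        "中国, 3057号, Jinke Road, Pudong, 201203, China",
    ],
    "lat": [31.204024, 31.181804],
})

# Drop every non-ASCII character (this would also strip accented Latin letters).
ascii_only = df["address"].str.replace(r"[^\x00-\x7F]+", "", regex=True)

# Narrower alternative: drop only characters in the CJK Unified Ideographs block.
no_cjk = df["address"].str.replace(r"[\u4e00-\u9fff]+", "", regex=True)

print(ascii_only.tolist())
print(no_cjk.tolist())
```

On this sample both patterns give the same result. They differ once the data contains other non-ASCII text such as accented Latin letters, which the first pattern removes and the second keeps.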