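For example, if I have a home address like this:
71 Pilgrim Avenue, Chevy Chase, MD
in a column named "address", I would like to split it into separate "street", "city", and "state" columns.
What is the best way to do this with Pandas?
I tried df[['street', 'city', 'state']] = df['address'].findall(r"myregex").
But I got the error: Must have equal len keys and value when setting with an iterable.
Thanks for your help :)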
jez*_*ael 16
You can use str.split with the regex ,\s+ (a comma followed by one or more whitespace characters):
# borrowing the sample from `Allen`
# raw string so that \s is passed to the regex engine unchanged
df[['street', 'city', 'state']] = df['address'].str.split(r',\s+', expand=True)
print(df)
                              address id             street         city  \
0  71 Pilgrim Avenue, Chevy Chase, MD  a  71 Pilgrim Avenue  Chevy Chase
1         72 Main St, Chevy Chase, MD  b         72 Main St  Chevy Chase

  state
0    MD
1    MD
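The sample DataFrame is not shown in this answer, so the following is a minimal, self-contained sketch of the same split. The two rows and the `id` column are assumptions reconstructed from the printed output above, not the original poster's data.

```python
import pandas as pd

# Assumed sample data, reconstructed from the output shown in the answer
df = pd.DataFrame({
    "address": [
        "71 Pilgrim Avenue, Chevy Chase, MD",
        "72 Main St, Chevy Chase, MD",
    ],
    "id": ["a", "b"],
})

# Split on a comma followed by one or more whitespace characters.
# expand=True returns a DataFrame with one column per piece,
# which is what makes the three-column assignment line up.
parts = df["address"].str.split(r",\s+", expand=True)
df[["street", "city", "state"]] = parts

print(df)
```

This only works cleanly when every address has exactly three comma-separated parts. Rows with more or fewer parts would produce a different number of columns, and the assignment would no longer match.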
If you need to remove the address column, add drop:
df[['street', 'city', 'state']] = df['address'].str.split(r',\s+', expand=True)
df = df.drop('address', axis=1)
print(df)
  id             street         city state
0  a  71 Pilgrim Avenue  Chevy Chase    MD
1  b         72 Main St  Chevy Chase    MD
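The question tried a regex with findall. `Series.str.findall` returns one list of matches per row rather than separate columns, which is a likely source of the length-mismatch error. If a regex-based approach is preferred, `Series.str.extract` with named groups is one alternative, sketched below. The pattern and the sample rows are assumptions for illustration and are not part of the original answer.

```python
import pandas as pd

# Assumed sample data, matching the rows printed in the answer
df = pd.DataFrame({
    "address": [
        "71 Pilgrim Avenue, Chevy Chase, MD",
        "72 Main St, Chevy Chase, MD",
    ],
    "id": ["a", "b"],
})

# Hypothetical pattern: three comma-separated fields, each captured
# in a named group. The group names become the new column names.
pattern = r"^(?P<street>[^,]+),\s+(?P<city>[^,]+),\s+(?P<state>[^,]+)$"

# str.extract returns a DataFrame with one column per capture group;
# rows that do not match the pattern are filled with NaN.
extracted = df["address"].str.extract(pattern)

# Replace the original column with the extracted ones
df = df.drop(columns="address").join(extracted)

print(df)
```

Compared with str.split, this variant leaves NaN in rows whose address does not have exactly three fields, instead of changing the number of resulting columns.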