Pandas: extract numbers from a column into new columns

Question


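I currently have this df, in which the rect column holds only strings. I need to extract x, y, w, and h from it into separate columns. The dataset is very large, so I need an efficient approach.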

df['rect'].head()
0    <Rect (120,168),260 by 120>
1    <Rect (120,168),260 by 120>
2    <Rect (120,168),260 by 120>
3    <Rect (120,168),260 by 120>
4    <Rect (120,168),260 by 120>


So far this solution works, but as you can see it is quite messy:

df[['x', 'y', 'w', 'h']] = df['rect'].str.replace(r'<Rect \(', '', regex=True).str.replace(r'\),', ',', regex=True).str.replace(' by ', ',').str.replace('>', '').str.split(',', n=3, expand=True)


Is there a better way? Perhaps a regex-based approach?

Answer 1

WeN*_*Ben 6

Use extractall

df[['x', 'y', 'w', 'h']] = df['rect'].str.extractall(r'(\d+)').unstack().loc[:, 0]

The right-hand side evaluates to:
match    0    1    2    3
0      120  168  260  120
1      120  168  260  120
2      120  168  260  120
3      120  168  260  120
4      120  168  260  120
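
As a rough sketch of what the chain above does, assuming a small hand-built sample frame (the sample data and the names `matches` and `nums` are illustrative): `extractall` returns one row per match with an extra `match` index level, `unstack` pivots that level into columns, and `.loc[:, 0]` selects capture group 0. The extracted values are strings, so a cast is added here in case integer columns are wanted.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's rect column.
df = pd.DataFrame({'rect': ['<Rect (120,168),260 by 120>'] * 5})

# One row per regex match, indexed by (original row, match number).
matches = df['rect'].str.extractall(r'(\d+)')

# Pivot the match level into columns, then keep capture group 0.
nums = matches.unstack().loc[:, 0]

# Values come back as strings; cast if integer columns are wanted.
df[['x', 'y', 'w', 'h']] = nums.astype(int)
```

The cast assumes every row yields exactly four numbers; rows with fewer matches would produce missing values and make `astype(int)` fail.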


Answer 2

piR*_*red 5

Inline

Produces a copy

df.assign(**dict(zip('xywh', zip(*df.rect.str.findall(r'\d+')))))

                          rect    x    y    w    h
0  <Rect (120,168),260 by 120>  120  168  260  120
1  <Rect (120,168),260 by 120>  120  168  260  120
2  <Rect (120,168),260 by 120>  120  168  260  120
3  <Rect (120,168),260 by 120>  120  168  260  120
4  <Rect (120,168),260 by 120>  120  168  260  120


Or simply reassign the result to df

df = df.assign(**dict(zip('xywh', zip(*df.rect.str.findall(r'\d+')))))

df

                          rect    x    y    w    h
0  <Rect (120,168),260 by 120>  120  168  260  120
1  <Rect (120,168),260 by 120>  120  168  260  120
2  <Rect (120,168),260 by 120>  120  168  260  120
3  <Rect (120,168),260 by 120>  120  168  260  120
4  <Rect (120,168),260 by 120>  120  168  260  120
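
A minimal sketch of the copy semantics, using hypothetical sample data: `assign` returns a new frame and leaves the original untouched until the result is bound back to a name. This variant transposes the per-row match lists with `zip(*...)` rather than iterating over the `.str` accessor, which, as far as I know, pandas 2.x no longer permits. The names `found`, `new_cols`, and `result` are illustrative.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's rect column.
df = pd.DataFrame({'rect': ['<Rect (120,168),260 by 120>'] * 5})

# Each row becomes a list of digit strings, e.g. ['120', '168', '260', '120'].
found = df['rect'].str.findall(r'\d+')

# zip(*found) transposes the rows into four column tuples, paired with names.
new_cols = dict(zip('xywh', zip(*found)))

# assign builds a new frame; df itself still has only the 'rect' column.
result = df.assign(**new_cols)
```

Note that `zip` truncates to the shortest row, so this form assumes every string contains the same number of digit groups.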


In place

Modifies the existing df

df[[*'xywh']] = pd.DataFrame(df.rect.str.findall(r'\d+').tolist())

df

                          rect    x    y    w    h
0  <Rect (120,168),260 by 120>  120  168  260  120
1  <Rect (120,168),260 by 120>  120  168  260  120
2  <Rect (120,168),260 by 120>  120  168  260  120
3  <Rect (120,168),260 by 120>  120  168  260  120
4  <Rect (120,168),260 by 120>  120  168  260  120
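
One caveat worth sketching, under the assumption that `df` might not carry the default 0..n-1 index: a frame built from `.tolist()` gets a fresh `RangeIndex`, and column assignment aligns on index labels, so passing `index=df.index` keeps the rows matched. The sample data, the custom index, the `parts` name, and the integer cast are illustrative additions, not part of the answer.

```python
import pandas as pd

# Hypothetical frame with a non-default index.
df = pd.DataFrame(
    {'rect': ['<Rect (120,168),260 by 120>', '<Rect (10,20),30 by 40>']},
    index=[10, 20],
)

parts = pd.DataFrame(
    df['rect'].str.findall(r'\d+').tolist(),
    index=df.index,          # keep labels so the assignment aligns row by row
    columns=list('xywh'),
).astype(int)                # findall yields strings; cast to integers

df[list('xywh')] = parts
```

With the default index, as in the question, the plain `pd.DataFrame(...tolist())` form and this one should give the same rows.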


Archived:	7 years, 3 months ago
Views:	748
Last activity:	7 years, 3 months ago