Split strings in pandas DataFrame elements and recombine part of the resulting list

Bil*_*ler 2 python dataframe pandas

I am trying to figure out how to split the strings in a pandas column and then recombine part of each split result. I have the following code:

import pandas as pd

df = pd.DataFrame({'code': ['PC001-S002_D_CFI4-1_NN','PC001-S002_D_CFI4-1_NN','PC001-S002_D_CFI4-1_NN',
                            'PC001-S002_D_CFI4-1_ER','PC001-S002_D_CFI4-1_ER','PC001-S002_D_CFI4-1_ER']})

df['domain'] = df['code'].str.split("_")

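This code splits each string on the underscore. Now I want to take the resulting list in each row of the column and rejoin its first three elements, so that: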

PC001-S001_D_CFI4-1_NN ==> PC001-S001_D_CFI4-1

For a single plain string, I can do this as follows:

a = 'PC001-S002_D_CFI4-1_NN'
b = a.split("_")[0:3]
c = "_".join(b)

However, I have tried to apply this in pandas without much success.

Any suggestions would be greatly appreciated.

jez*_*ael 5

Use str[:3] to select the first 3 elements of each list, then str.join:

df['domain'] = df['code'].str.split("_").str[:3].str.join('_')
print (df)

                     code               domain
0  PC001-S002_D_CFI4-1_NN  PC001-S002_D_CFI4-1
1  PC001-S002_D_CFI4-1_NN  PC001-S002_D_CFI4-1
2  PC001-S002_D_CFI4-1_NN  PC001-S002_D_CFI4-1
3  PC001-S002_D_CFI4-1_ER  PC001-S002_D_CFI4-1
4  PC001-S002_D_CFI4-1_ER  PC001-S002_D_CFI4-1
5  PC001-S002_D_CFI4-1_ER  PC001-S002_D_CFI4-1
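To make the chained call easier to follow, here is a minimal sketch that breaks it into its three steps. The intermediate names (parts, first_three) and the extra domain_apply column are only illustrative and are not part of the original answer; the last line applies the plain-Python logic from the question row by row for comparison.

```python
import pandas as pd

df = pd.DataFrame({'code': ['PC001-S002_D_CFI4-1_NN', 'PC001-S002_D_CFI4-1_ER']})

# Step 1: split every string on "_"; each element becomes a list of substrings
parts = df['code'].str.split('_')

# Step 2: .str[:3] slices each list, keeping only its first three items
first_three = parts.str[:3]

# Step 3: .str.join('_') glues the kept items back into one string per row
df['domain'] = first_three.str.join('_')

# Illustrative alternative: the plain-Python logic from the question, applied per element
df['domain_apply'] = df['code'].apply(lambda s: '_'.join(s.split('_')[:3]))

print(df)
```

The .str version stays within pandas' string accessor, while the apply version reuses the exact logic from the question; both should give the same column for this data.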


Max*_*axU 5

You can use Series.str.rsplit(...):

In [11]: df['domain'] = df['code'].str.rsplit('_', n=1).str[0]

In [12]: df
Out[12]:
                     code               domain
0  PC001-S002_D_CFI4-1_NN  PC001-S002_D_CFI4-1
1  PC001-S002_D_CFI4-1_NN  PC001-S002_D_CFI4-1
2  PC001-S002_D_CFI4-1_NN  PC001-S002_D_CFI4-1
3  PC001-S002_D_CFI4-1_ER  PC001-S002_D_CFI4-1
4  PC001-S002_D_CFI4-1_ER  PC001-S002_D_CFI4-1
5  PC001-S002_D_CFI4-1_ER  PC001-S002_D_CFI4-1
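Note that "keep the first three parts" and "drop the last part" only coincide when every string has exactly four underscore-separated parts, as in the question's data. The sketch below uses made-up sample strings (not from the question) to show where the two approaches diverge.

```python
import pandas as pd

# Hypothetical strings with 4, 5 and 2 underscore-separated parts
s = pd.Series(['A_B_C_D', 'A_B_C_D_E', 'A_B'])

# Keep at most the first three parts
keep_first_three = s.str.split('_').str[:3].str.join('_')

# Drop only the last part (split once, starting from the right)
drop_last = s.str.rsplit('_', n=1).str[0]

print(pd.DataFrame({'code': s,
                    'keep_first_three': keep_first_three,
                    'drop_last': drop_last}))
```

Which one is appropriate depends on whether the fixed part of the code is the prefix (first three parts) or the suffix (one trailing part).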

Or simply remove the last part:

In [7]: df['domain'] = df['code'].str.replace(r'\_\w+?$', '', regex=True)

In [8]: df
Out[8]:
                     code               domain
0  PC001-S002_D_CFI4-1_NN  PC001-S002_D_CFI4-1
1  PC001-S002_D_CFI4-1_NN  PC001-S002_D_CFI4-1
2  PC001-S002_D_CFI4-1_NN  PC001-S002_D_CFI4-1
3  PC001-S002_D_CFI4-1_ER  PC001-S002_D_CFI4-1
4  PC001-S002_D_CFI4-1_ER  PC001-S002_D_CFI4-1
5  PC001-S002_D_CFI4-1_ER  PC001-S002_D_CFI4-1
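One caveat with the regex approach: \w also matches the underscore itself, so the pattern above removes only the last part here because the hyphen in CFI4-1 stops the match from starting any earlier. A minimal sketch, using an assumed alternative pattern _[^_]*$ (not from the original answer) and one made-up string without hyphens, illustrates the difference. Passing regex=True explicitly is needed in recent pandas versions, where str.replace treats the pattern as a literal string by default.

```python
import pandas as pd

# First value is from the question; the second is a hypothetical code without hyphens
s = pd.Series(['PC001-S002_D_CFI4-1_NN', 'A_B_C_D'])

# Pattern from the answer: "_" followed by word characters up to the end of the string
lazy_word = s.str.replace(r'_\w+?$', '', regex=True)

# Assumed alternative: "_" followed only by non-underscore characters up to the end
no_underscore = s.str.replace(r'_[^_]*$', '', regex=True)

print(pd.DataFrame({'code': s,
                    'lazy_word': lazy_word,
                    'no_underscore': no_underscore}))
```

For the question's data both patterns return the same result; for strings made up only of word characters, the second pattern is the one that reliably strips just the final part.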