Remove the suffix from certain column names in Pandas

Question


Here is my df:

df = pd.DataFrame({'a_x':[1, 2, 3], 'b_x':[20, 30, 40], 'c_x':[10, 40, 50]})


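How can I remove the suffix from the names of the first and third columns?

I don't want to use df = df.rename(columns={'a_x':'a', 'c_x':'c'}), because that hardcodes each column one by one.

Edit 1: I have a list of the columns whose suffix I want to remove. In this case it is ['a', 'c'].

The result I want looks like this: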

   a  b_x   c
0  1   20  10
1  2   30  40
2  3   40  50


Answer 1

Dat*_*ice 6

I think a simple if/else inside a list comprehension will do the job here.

import pandas as pd

# Full names of the columns whose suffix should be dropped
trg_cols = ['a_x', 'c_x']
# Keep only the part before the first underscore for the targeted columns
new_cols = [col.split('_')[0] if col in trg_cols else col for col in df.columns]

df.columns = new_cols
print(df)
   a  b_x   c
0  1   20  10
1  2   30  40
2  3   40  50

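One caveat with split('_')[0] is that it keeps only the text before the first underscore, so a name like 'my_col_x' would become 'my'. A minimal sketch of a variant that strips only the trailing suffix is below. It assumes Python 3.9+ for str.removesuffix and assumes the suffix is '_x', as in the question's column names.

```python
import pandas as pd

df = pd.DataFrame({'a_x': [1, 2, 3], 'b_x': [20, 30, 40], 'c_x': [10, 40, 50]})

suffix = '_x'              # assumed suffix, taken from the question's column names
trg_cols = ['a_x', 'c_x']  # full names of the columns to rename

# str.removesuffix strips only a trailing '_x', so underscores earlier
# in a column name are left untouched.
df.columns = [col.removesuffix(suffix) if col in trg_cols else col
              for col in df.columns]

print(df.columns.tolist())  # ['a', 'b_x', 'c']
```

On older Python versions, col[:-len(suffix)] guarded by col.endswith(suffix) should give the same result.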

Alternatively, if you would rather select by the target output column names:

trg_cols = ['a','c']
new_cols = [col if col.split('_')[0] not in trg_cols else col.split('_')[0]
            for col in df.columns]


df.columns = new_cols
print(df)
   a  b_x   c
0  1   20  10
1  2   30  40
2  3   40  50

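The same selection by output name can also be written with df.rename, which accepts a function for columns. The sketch below is only one way to do it: strip_suffix is a hypothetical helper name, and the '_x' suffix is an assumption taken from the question.

```python
import pandas as pd

df = pd.DataFrame({'a_x': [1, 2, 3], 'b_x': [20, 30, 40], 'c_x': [10, 40, 50]})

trg_cols = {'a', 'c'}  # desired output names (a set gives fast lookups)
suffix = '_x'          # assumed suffix

def strip_suffix(col):
    """Hypothetical helper: drop the suffix only if the result is a target name."""
    base = col[:-len(suffix)] if col.endswith(suffix) else col
    return base if base in trg_cols else col

# rename calls strip_suffix once per column label and returns a new DataFrame
df = df.rename(columns=strip_suffix)

print(df.columns.tolist())  # ['a', 'b_x', 'c']
```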

Using fillna and pd.Series:

trg_cols = ['a','c']
s = pd.Series(df.columns)
df.columns = s.str.extract('('+'|'.join(trg_cols)+')',expand=False).fillna(s)


print(df)
   a  b_x   c
0  1   20  10
1  2   30  40
2  3   40  50

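Note that the pattern '(a|c)' is not anchored, so it matches the first 'a' or 'c' anywhere in a name and could mangle other columns. A hedged sketch of a stricter pattern is below. It assumes the suffix is '_x', and the extra column 'ca_x' is hypothetical, added only to show that non-target names are left alone.

```python
import re

import pandas as pd

df = pd.DataFrame({'a_x': [1, 2, 3], 'b_x': [20, 30, 40],
                   'c_x': [10, 40, 50], 'ca_x': [0, 0, 0]})  # 'ca_x' is hypothetical

trg_cols = ['a', 'c']
suffix = '_x'  # assumed suffix

# Match a whole column name: one of the target names followed by the suffix.
# re.escape guards against regex metacharacters in the names.
pattern = '^(' + '|'.join(map(re.escape, trg_cols)) + ')' + re.escape(suffix) + '$'

s = pd.Series(df.columns)
# Names that do not match yield NaN, which fillna replaces with the original name
df.columns = s.str.extract(pattern, expand=False).fillna(s)

print(df.columns.tolist())  # ['a', 'b_x', 'c', 'ca_x']
```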

Answer 2

Cyt*_*rak 5

You can define your target columns and rename:

>>> target_cols = ['c']
>>> suffix = '_x'
>>> df.rename(columns={col + suffix: col for col in target_cols})

   a_x  b_x   c
0    1   20  10
1    2   30  40
2    3   40  50

# or
# df.rename(columns=dict(zip((col + suffix for col in target_cols), target_cols)))

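Keep in mind that df.rename returns a new DataFrame, so the result has to be assigned back to take effect. A small sketch that wraps this idea in a function follows. The name remove_suffix is hypothetical, and passing errors='raise' is an optional choice here so that a misspelled target name fails loudly instead of being silently ignored.

```python
import pandas as pd

def remove_suffix(df, target_cols, suffix):
    """Hypothetical helper: return a copy of df with suffix dropped from target_cols."""
    mapping = {col + suffix: col for col in target_cols}
    # errors='raise' raises KeyError if any suffixed name is not a column
    return df.rename(columns=mapping, errors='raise')

df = pd.DataFrame({'a_x': [1, 2, 3], 'b_x': [20, 30, 40], 'c_x': [10, 40, 50]})
renamed = remove_suffix(df, ['a', 'c'], '_x')

print(renamed.columns.tolist())  # ['a', 'b_x', 'c']
```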
