I have a DataFrame like this:
col1 col2 col3 col4 col5 col6 col7 col8
0 5345 rrf rrf rrf rrf rrf rrf
1 2527 erfr erfr erfr erfr erfr erfr
2 2727 f f f f f f
I want to rename all of the columns except col1 and col2.
So I tried writing a loop:
print(df.columns)
for col in df.columns:
    if col != 'col1' and col != 'col2':
        col.rename = str(col) + '_x'
But it doesn't work!
A.K*_*Kot 13
You can use the DataFrame.rename() method:
new_names = [(i,i+'_x') for i in df.iloc[:, 2:].columns.values]
df.rename(columns = dict(new_names), inplace=True)
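For context, the loop in the question fails because iterating over df.columns yields plain Python strings, and a str has no rename attribute that can be assigned, so `col.rename = ...` raises AttributeError and nothing is written back to the DataFrame. A minimal sketch of a loop that does work is below. It assumes a cut-down copy of the sample data, and the `mapping` name is only illustrative. It collects the new names in a dict and hands that dict to rename:

```python
import pandas as pd

# Cut-down version of the frame from the question (values are illustrative).
df = pd.DataFrame({
    'col1': [0, 1, 2],
    'col2': [5345, 2527, 2727],
    'col3': ['rrf', 'erfr', 'f'],
    'col4': ['rrf', 'erfr', 'f'],
})

# Build an old-name -> new-name mapping, skipping the columns to keep.
mapping = {}
for col in df.columns:
    if col not in ('col1', 'col2'):
        mapping[col] = col + '_x'

# Without inplace=True, rename returns a new DataFrame and leaves df untouched.
renamed = df.rename(columns=mapping)
print(list(renamed.columns))
```

Unlike the iloc[:, 2:] version, this selects columns by name, so it does not depend on col1 and col2 being the first two columns.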
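The simplest solution, if col1 and col2 are the first and second column names (and the names are already in sorted order, since Index.union sorts its result):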
df.columns = df.columns[:2].union(df.columns[2:] + '_x')
print (df)
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
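To illustrate the ordering behavior of Index.union, here is a small sketch on a hypothetical frame whose column names are deliberately not in sorted order. It compares union with Index.append, which simply concatenates and may be the safer choice when the positional order has to be preserved:

```python
import pandas as pd

# Hypothetical frame whose column names are NOT in sorted order.
df = pd.DataFrame([[1, 2, 3, 4]], columns=['b', 'a', 'd', 'c'])

head = df.columns[:2]          # columns to keep as they are
tail = df.columns[2:] + '_x'   # remaining columns with the suffix added

# union is a set operation; with the default sort=None it sorts the result,
# so the new labels no longer line up with the original column positions.
print(list(head.union(tail)))

# append concatenates the two indexes, preserving positional order.
print(list(head.append(tail)))

df.columns = head.append(tail)
```

With the data from the question both approaches give the same result, because col1 ... col8_x already sort in their original order.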
Another solution, using isin or a list comprehension:
cols = df.columns[~df.columns.isin(['col1','col2'])]
print (cols)
['col3', 'col4', 'col5', 'col6', 'col7', 'col8']
df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True)
print (df)
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
cols = [col for col in df.columns if col not in ['col1', 'col2']]
print (cols)
['col3', 'col4', 'col5', 'col6', 'col7', 'col8']
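# cols is a plain list here, so cols + '_x' would raise a TypeError;
# build the mapping with a dict comprehension instead
df.rename(columns = {col: col + '_x' for col in cols}, inplace=True)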
print (df)
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
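As a variation on the same idea, rename also accepts a function for its columns argument, which avoids building the intermediate mapping. The sketch below assumes a one-row stand-in for the sample data, and the `keep` set is a hypothetical name:

```python
import pandas as pd

# One-row stand-in for the frame from the question.
df = pd.DataFrame([[0, 5345, 'rrf', 'rrf']],
                  columns=['col1', 'col2', 'col3', 'col4'])

# Columns that should keep their original names (hypothetical helper set).
keep = {'col1', 'col2'}

# The function is called once per column label and returns the new label.
renamed = df.rename(columns=lambda c: c if c in keep else c + '_x')
print(list(renamed.columns))
```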
The fastest is a list comprehension:
df.columns = [col+'_x' if col != 'col1' and col != 'col2' else col for col in df.columns]
Timings:
In [350]: %timeit (akot(df))
1000 loops, best of 3: 387 µs per loop
In [351]: %timeit (jez(df1))
The slowest run took 4.12 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 207 µs per loop
In [363]: %timeit (jez3(df2))
The slowest run took 6.41 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 75.7 µs per loop
df1 = df.copy()
df2 = df.copy()

def jez(df):
    df.columns = df.columns[:2].union(df.columns[2:] + '_x')
    return df

def akot(df):
    new_names = [(i,i+'_x') for i in df.iloc[:, 2:].columns.values]
    df.rename(columns = dict(new_names), inplace=True)
    return df

def jez3(df):
    df.columns = [col + '_x' if col != 'col1' and col != 'col2' else col for col in df.columns]
    return df

print (akot(df))
print (jez(df1))
print (jez3(df2))
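One thing to watch when reproducing these timings: the three functions modify their argument in place, so repeated calls (which is what %timeit does) keep working on columns that have already been renamed and add another '_x' each time. A sketch of a repeatable comparison using the standard timeit module is below. The function names `by_rename` and `by_list_comprehension` are hypothetical non-mutating variants, and absolute numbers will differ by machine and pandas version:

```python
import timeit

import pandas as pd

# Stand-in for the frame from the question.
df = pd.DataFrame([[0, 5345, 'rrf', 'rrf'],
                   [1, 2527, 'erfr', 'erfr'],
                   [2, 2727, 'f', 'f']],
                  columns=['col1', 'col2', 'col3', 'col4'])

def by_rename(frame):
    # rename without inplace=True returns a new frame each call.
    new_names = {c: c + '_x' for c in frame.columns[2:]}
    return frame.rename(columns=new_names)

def by_list_comprehension(frame):
    # Work on a copy so the input is never modified.
    out = frame.copy()
    out.columns = [c if c in ('col1', 'col2') else c + '_x' for c in out.columns]
    return out

runs = 200
for func in (by_rename, by_list_comprehension):
    seconds = timeit.timeit(lambda: func(df), number=runs)
    print(f'{func.__name__}: {seconds / runs * 1e6:.1f} µs per call')
```

Because each variant also pays for creating a new frame, these figures are not directly comparable with the in-place timings quoted above.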
You can use str.contains with a regex pattern to filter the columns of interest, then use zip to build a dict and pass it as the argument to rename:
In [94]:
cols = df.columns[~df.columns.str.contains('col1|col2')]
df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True)
df
Out[94]:
col1 col2 col3_x col4_x col5_x col6_x col7_x col8_x
0 0 5345 rrf rrf rrf rrf rrf rrf
1 1 2527 erfr erfr erfr erfr erfr erfr
2 2 2727 f f f f f f
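Here str.contains selects columns by name, and negating the mask with ~ returns the columns that do not match, so the column order doesn't matter.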
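One caveat with the pattern above: str.contains matches substrings, so 'col1|col2' would also match a name such as col10 and leave it unrenamed. A small sketch of the difference is below, using a hypothetical set of column names and an anchored pattern as one possible way to require exact matches:

```python
import pandas as pd

# Hypothetical column names, including one that merely contains 'col1'.
cols = pd.Index(['col1', 'col2', 'col3', 'col10'])

# Substring match: 'col10' contains 'col1', so it is excluded as well.
loose = cols[~cols.str.contains('col1|col2')]

# Anchored match: only the exact names col1 and col2 are excluded.
exact = cols[~cols.str.contains(r'^(?:col1|col2)$')]

print(list(loose))
print(list(exact))

# The mapping for rename can then be built the same way as before.
mapping = dict(zip(exact, exact + '_x'))
```

When the names to keep are known exactly, isin(['col1', 'col2']) avoids the regex altogether.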