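I am having trouble changing the header row of an existing DataFrame using pandas in Python. After importing pandas and the csv file, I set the header row to None so that I could remove duplicate dates after transposing. However, this leaves me with a row header (actually an index column) that I do not want.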
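import pandas as pd

df = pd.read_csv(spreadfile, header=None)
# keep='last' replaces the removed take_last=True argument
df2 = df.T.drop_duplicates([0], keep='last')
del df2[1]
# .iloc replaces the removed .ix indexer
indcol = df2.iloc[:, 0]
df3 = df2.reindex(indcol)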
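However, the unimaginative code above fails on both counts. The index column is now the required column, but all of the entries are now NaN. My understanding of Python is not yet good enough to recognise what Python is doing. The desired output below is what I need; any help would be greatly appreciated!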
df2 before reindexing:
           0             2             3             4             5
0        NaN  XS0089553282  XS0089773484  XS0092157600  XS0092541969
1  01-May-14         131.7         165.1         151.8          88.9
3  02-May-14           131         164.9         151.7          88.5
5  05-May-14         131.1           165         151.8          88.6
7  06-May-14         129.9         163.4         151.2          87.1
df2 after reindexing:
             0    2    3    4    5
0
NaN        NaN  NaN  NaN  NaN  NaN
01-May-14  NaN  NaN  NaN  NaN  NaN
02-May-14  NaN  NaN  NaN  NaN  NaN
05-May-14  NaN  NaN  NaN  NaN  NaN
06-May-14  NaN  NaN  NaN  NaN  NaN
Desired df2:
           XS0089553282  XS0089773484  XS0092157600  XS0092541969
01-May-14         131.7         165.1         151.8          88.9
02-May-14           131         164.9         151.7          88.5
05-May-14         131.1           165         151.8          88.6
06-May-14         129.9         163.4         151.2          87.1
Assign the columns directly:
indcol = df2.iloc[:, 0]
df2.columns = indcol
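The problem with reindex is that it uses the existing index and column values of your df. The new values you pass in do not exist among those labels, which is why you get all NaNs.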
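To make the difference concrete, here is a minimal sketch contrasting reindex with direct assignment. The two-row frame and the names `reindexed` and `relabelled` are hypothetical stand-ins, with values borrowed from the question's table.

```python
import pandas as pd

# Hypothetical toy frame standing in for df2 (two rows from the question).
df = pd.DataFrame({"date": ["01-May-14", "02-May-14"], "price": [131.7, 131.0]})

# reindex looks the new labels up in the existing index (0 and 1);
# none of the date strings are found there, so every row comes back as NaN.
reindexed = df.reindex(df["date"])

# Assigning to .index only relabels the rows; the data stays where it is.
relabelled = df.copy()
relabelled.index = relabelled["date"].values

print(reindexed)
print(relabelled)
```

In other words, reindex selects rows by label, while assigning to `.index` or `.columns` renames the labels.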
A simpler way to do what you are attempting:
In [147]:
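# take the cols and index values of interest
# (df here is the "df2 before reindexing" frame with string column labels;
# with integer labels, as header=None produces, use 0 and 2 instead of '0' and '2')
cols = df.loc[0, '2':]
idx = df['0'].iloc[1:]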
print(cols)
print(idx)
2    XS0089553282
3    XS0089773484
4    XS0092157600
5    XS0092541969
Name: 0, dtype: object
1    01-May-14
3    02-May-14
5    05-May-14
7    06-May-14
Name: 0, dtype: object
In [157]:
# drop the first row and the first column
df2 = df.drop('0', axis=1).drop(0)
# overwrite the index values
df2.index = idx.values
df2
Out[157]:
               2      3      4     5
01-May-14  131.7  165.1  151.8  88.9
02-May-14    131  164.9  151.7  88.5
05-May-14  131.1    165  151.8  88.6
06-May-14  129.9  163.4  151.2  87.1
In [158]:
# now overwrite the column values
df2.columns = cols.values
df2
Out[158]:
           XS0089553282  XS0089773484  XS0092157600  XS0092541969
01-May-14         131.7         165.1         151.8          88.9
02-May-14           131         164.9         151.7          88.5
05-May-14         131.1           165         151.8          88.6
06-May-14         129.9         163.4         151.2          87.1