How do I reset the index in a pandas DataFrame?

Rom*_*man 290 python indexing dataframe pandas

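I have a DataFrame from which I removed some rows. As a result, its index now looks like this: [1, 5, 6, 10, 11]. I would like to reset it to [0, 1, 2, 3, 4]. How can I do that?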


The following seems to work:

df = df.reset_index()
del df['index']

The following does not work:

df = df.reindex()
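A likely explanation for why `reindex()` has no visible effect: it conforms the frame to a set of labels you pass in rather than renumbering the rows, and with no arguments there is nothing to conform to. A minimal sketch, assuming a small hypothetical frame whose index matches the one in the question:

```python
import pandas as pd

# Hypothetical frame whose index matches the one in the question.
df = pd.DataFrame({"a": [10, 20, 30, 40, 50]}, index=[1, 5, 6, 10, 11])

# With no arguments, reindex() returns a frame with the same labels.
same = df.reindex()
print(list(same.index))  # [1, 5, 6, 10, 11]

# Passing new labels selects rows by label. Labels that do not exist
# (0, 2, 3 and 4 here) become NaN rows; the rows are not renumbered.
relabeled = df.reindex(range(5))
print(relabeled["a"].isna().sum())  # 4
```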

mkl*_*kln 585

reset_index() is what you are looking for. If you do not want the old index saved as a column, use:

df = df.reset_index(drop=True)

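  • You can set the `inplace=True` parameter instead of reassigning the DataFrame to the same variable. (94 upvotes)
  • Note that with `inplace=True` the method returns None (23 upvotes)
  • @Victor - if you do not drop the index, a new index is added and the old index values are kept in the DataFrame as a column (2 upvotes)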
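To make the effect of `drop` concrete, here is a small sketch comparing the default call with `drop=True`. The frame and the column name `a` are made up for illustration:

```python
import pandas as pd

# Hypothetical frame whose index matches the one in the question.
df = pd.DataFrame({"a": [10, 20, 30, 40, 50]}, index=[1, 5, 6, 10, 11])

# Default (drop=False): the old labels are kept as a new column named "index".
kept = df.reset_index()
print(list(kept.columns))   # ['index', 'a']
print(list(kept["index"]))  # [1, 5, 6, 10, 11]

# drop=True: the old labels are discarded.
dropped = df.reset_index(drop=True)
print(list(dropped.columns))  # ['a']
print(list(dropped.index))    # [0, 1, 2, 3, 4]
```

Neither call modifies `df` itself, which is why the answer reassigns the result.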

jez*_*ael 42

Another solution is to assign a RangeIndex or a range:

df.index = pd.RangeIndex(len(df.index))

df.index = range(len(df.index))

It is faster:

df = pd.DataFrame({'a':[8,7], 'c':[2,4]}, index=[7,8])
df = pd.concat([df]*10000)
print (df.head())

In [298]: %timeit df1 = df.reset_index(drop=True)
The slowest run took 7.26 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 105 µs per loop

In [299]: %timeit df.index = pd.RangeIndex(len(df.index))
The slowest run took 15.05 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.84 µs per loop

In [300]: %timeit df.index = range(len(df.index))
The slowest run took 7.10 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 14.2 µs per loop

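  • @Outcast - the fastest is `len(df.index)`: 381 ns vs. 1.17 µs for `df.shape`. Or am I missing something? (2 upvotes)
  • This is an elegant solution for resetting the index. Thank you! I found that if you convert an hdf5 object to a pandas.DataFrame object, you have to reset the index before you can edit certain parts of the DataFrame. (2 upvotes)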
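For readers without IPython, the comparison can be roughly reproduced with the standard-library `timeit` module. This is only a sketch: the helper function names are made up here, and absolute timings will vary with machine and pandas version, so none are shown. It also checks that the direct assignments produce the same index as `reset_index(drop=True)`:

```python
import timeit

import pandas as pd

df = pd.DataFrame({"a": [8, 7], "c": [2, 4]}, index=[7, 8])
df = pd.concat([df] * 10000)

# Reference result from reset_index.
expected = df.reset_index(drop=True)

# Assigning to .index mutates the frame, so work on copies.
via_rangeindex = df.copy()
via_rangeindex.index = pd.RangeIndex(len(via_rangeindex.index))

via_range = df.copy()
via_range.index = range(len(via_range.index))

print(via_rangeindex.index.equals(expected.index))  # True
print(via_range.index.equals(expected.index))       # True

# Rough timing; "work" is a scratch copy so df stays untouched.
work = df.copy()

def with_reset_index():
    return work.reset_index(drop=True)

def with_rangeindex():
    work.index = pd.RangeIndex(len(work.index))

for fn in (with_reset_index, with_rangeindex):
    seconds = timeit.timeit(fn, number=1000)
    print(f"{fn.__name__}: {seconds / 1000 * 1e6:.1f} µs per call")
```

Note the difference in semantics: `reset_index(drop=True)` returns a new frame, while assigning to `df.index` changes the existing frame in place.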

小智 11

data1.reset_index(inplace=True)
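One caveat with the in-place call as written: without `drop=True` the old labels end up in a column named `index`, which is not quite what the question asks for. A short sketch, using hypothetical stand-ins for `data1`:

```python
import pandas as pd

# Hypothetical stand-in for data1, with the index from the question.
data1 = pd.DataFrame({"a": [10, 20, 30, 40, 50]}, index=[1, 5, 6, 10, 11])

# In-place call: the frame is modified and the call itself returns None.
result = data1.reset_index(inplace=True)
print(result)               # None
print(list(data1.index))    # [0, 1, 2, 3, 4]
print(list(data1.columns))  # ['index', 'a']

# Adding drop=True avoids the extra column.
data2 = pd.DataFrame({"a": [10, 20, 30, 40, 50]}, index=[1, 5, 6, 10, 11])
data2.reset_index(drop=True, inplace=True)
print(list(data2.columns))  # ['a']
```

Because the in-place call returns None, writing `data1 = data1.reset_index(inplace=True)` would replace the frame with None.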