Python Pandas: How do I move a row to the first row of a DataFrame?

Rex*_*Rex 9 python numpy dataframe pandas

Given an existing DataFrame that has been indexed:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(10, 5), columns=['a', 'b', 'c', 'd', 'e'])
>>> df
          a         b         c         d         e
0 -0.131666 -0.315019  0.306728 -0.642224 -0.294562
1  0.769310 -1.277065  0.735549 -0.900214 -1.826320
2 -1.561325 -0.155571  0.544697  0.275880 -0.451564
3  0.612561 -0.540457  2.390871 -2.699741  0.534807
4 -1.504476 -2.113726  0.785208 -1.037256 -0.292959
5  0.467429  1.327839 -1.666649  1.144189  0.322896
6 -0.306556  1.668364  0.036508  0.596452  0.066755
7 -1.689779  1.469891 -0.068087 -1.113231  0.382235
8  0.028250 -2.145618  0.555973 -0.473131 -0.638056
9  0.633408 -0.791857  0.933033  1.485575 -0.021429
>>> df.set_index("a")
                  b         c         d         e
a                                                
-0.131666 -0.315019  0.306728 -0.642224 -0.294562
 0.769310 -1.277065  0.735549 -0.900214 -1.826320
-1.561325 -0.155571  0.544697  0.275880 -0.451564
 0.612561 -0.540457  2.390871 -2.699741  0.534807
-1.504476 -2.113726  0.785208 -1.037256 -0.292959
 0.467429  1.327839 -1.666649  1.144189  0.322896
-0.306556  1.668364  0.036508  0.596452  0.066755
-1.689779  1.469891 -0.068087 -1.113231  0.382235
 0.028250 -2.145618  0.555973 -0.473131 -0.638056
 0.633408 -0.791857  0.933033  1.485575 -0.021429

How can I move the 3rd row to the first row?

That is, the expected result is:

                  b         c         d         e
a                                                
-1.561325 -0.155571  0.544697  0.275880 -0.451564
-0.131666 -0.315019  0.306728 -0.642224 -0.294562
 0.769310 -1.277065  0.735549 -0.900214 -1.826320
 0.612561 -0.540457  2.390871 -2.699741  0.534807
-1.504476 -2.113726  0.785208 -1.037256 -0.292959
 0.467429  1.327839 -1.666649  1.144189  0.322896
-0.306556  1.668364  0.036508  0.596452  0.066755
-1.689779  1.469891 -0.068087 -1.113231  0.382235
 0.028250 -2.145618  0.555973 -0.473131 -0.638056
 0.633408 -0.791857  0.933033  1.485575 -0.021429

The original first row should now become the second row.

Ale*_*der 13

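To move the third row to the first row, you can build a list of positions that puts the target row first. I use a conditional list comprehension for the remaining rows and concatenate the two lists.

Then simply use iloc to select the rows in that order.

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=['a', 'b', 'c'])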
>>> df
          a         b         c
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
2  0.950088 -0.151357 -0.103219
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863

target_row = 2
# Move target row to first element of list.
idx = [target_row] + [i for i in range(len(df)) if i != target_row]

>>> df.iloc[idx]
          a         b         c
2  0.950088 -0.151357 -0.103219
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863

If desired, you can also reset the index.

>>> df.iloc[idx].reset_index(drop=True)
          a         b         c
0  0.950088 -0.151357 -0.103219
1  1.764052  0.400157  0.978738
2  2.240893  1.867558 -0.977278
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863
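
The question's DataFrame uses column a as its index, so the row labels are floats rather than 0..n-1. Because iloc selects by position, the same idea still applies there. Here is a minimal sketch that wraps it in a function; `move_row_to_front` is a hypothetical helper name, not part of pandas.

```python
import numpy as np
import pandas as pd


def move_row_to_front(df, pos):
    """Return a new DataFrame with the row at integer position `pos` first.

    Hypothetical helper for illustration; not a pandas API.
    """
    if not 0 <= pos < len(df):
        raise IndexError(f"pos {pos} is out of range for {len(df)} rows")
    # Target position first, then every other position in original order.
    order = [pos] + [i for i in range(len(df)) if i != pos]
    return df.iloc[order]


np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=["a", "b", "c"]).set_index("a")

# Move the 3rd row (position 2) to the top; the index labels travel with the rows.
moved = move_row_to_front(df, 2)
print(moved)
```

The function returns a new object and leaves df untouched, which matches the behaviour of df.iloc[idx] above.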

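Alternatively, you can reindex with the idx list. Note that reindex selects by label, so it gives the same result as iloc here only because the index is the default 0..n-1: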

>>> df.reindex(idx)
          a         b         c
2  0.950088 -0.151357 -0.103219
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863
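
When the index is not the default integer range, reindex needs the new order expressed as labels. The sketch below uses a small made-up frame with string labels (the data and variable names are illustrative assumptions) and assumes the labels are unique, since reindex cannot reorder an index with duplicate labels.

```python
import pandas as pd

# Illustrative data: a frame whose index holds labels, not positions.
df = pd.DataFrame(
    {"b": [10, 20, 30, 40]},
    index=pd.Index(["w", "x", "y", "z"], name="a"),
)

target_label = "y"
# Build the new label order: target label first, then the remaining labels.
new_order = [target_label] + [lab for lab in df.index if lab != target_label]
by_label = df.reindex(new_order)

# Passing positions instead of labels matches nothing in this index,
# so every row of the result is filled with NaN.
by_position_mistake = df.reindex([2, 0, 1, 3])

print(by_label)
print(by_position_mistake)
```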


小智 5

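Reindexing is probably the best solution for putting rows into any new order in one obvious step, except that it may require creating a new DataFrame, which could be very large.

For example: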

import pandas as pd

t = pd.read_csv('table.txt', sep=r'\s+')
t
Out[81]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
0   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
1   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
2   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four

t.index
Out[82]: Int64Index([0, 1, 2, 3], dtype='int64')

t2 = t.reindex([2,0,1,3]) # cannot do this in place
t2
Out[93]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
2   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
0   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
1   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four

The index can now be set back to range(4) without reindexing again:

t2.index = range(4)
t2
Out[102]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
0   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
1   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
2   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four
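
To try the two steps above without a table.txt file, the table can be read from an in-memory string. This is a sketch: the file contents are an assumption reconstructed from the printed output, and reset_index(drop=True) is used in place of assigning range(4) to the index.

```python
import io

import pandas as pd

# Assumed contents of table.txt, reconstructed from the output shown above.
TABLE_TEXT = """\
DG/VD TYPE State Access Consist Cache sCC Size Units Name
0/0 RAID1 Optl RW No RWTD - 1.818 TB one
1/1 RAID1 Optl RW No RWTD - 1.818 TB two
2/2 RAID1 Optl RW No RWTD - 1.818 TB three
3/3 RAID1 Optl RW No RWTD - 1.818 TB four
"""

t = pd.read_csv(io.StringIO(TABLE_TEXT), sep=r"\s+")

# Reorder by label, then renumber the rows 0..3 in one chained expression.
t2 = t.reindex([2, 0, 1, 3]).reset_index(drop=True)
print(t2)
```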

It can also be done with 'tuple swapping' and row selection as the basic mechanism, without creating a new DataFrame. For example:

import pandas as pd

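t = pd.read_csv('table.txt', sep=r'\s+')

# .ix was removed in pandas 1.0; .iloc selects by position instead.
# Copy each row before assigning so the swap never reads an overwritten row.
t.iloc[1], t.iloc[2] = t.iloc[2].copy(), t.iloc[1].copy()
t.iloc[0], t.iloc[1] = t.iloc[1].copy(), t.iloc[0].copy()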
t
Out[96]: 
  DG/VD   TYPE State Access Consist Cache sCC   Size Units   Name
0   2/2  RAID1  Optl     RW      No  RWTD   -  1.818    TB  three
1   0/0  RAID1  Optl     RW      No  RWTD   -  1.818    TB    one
2   1/1  RAID1  Optl     RW      No  RWTD   -  1.818    TB    two
3   3/3  RAID1  Optl     RW      No  RWTD   -  1.818    TB   four
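
The pairwise swaps can be generalized to move any row to the top by swapping it with its upper neighbour repeatedly. The following is only a sketch under the assumption of a current pandas version with .iloc; `swap_rows` and `bubble_to_front` are hypothetical names, and the small frame is made-up data.

```python
import pandas as pd


def swap_rows(df, i, j):
    """Swap the rows at integer positions i and j, modifying df."""
    # Copy both rows first so the second assignment does not read a row
    # that the first assignment has already overwritten.
    row_i, row_j = df.iloc[i].copy(), df.iloc[j].copy()
    df.iloc[i] = row_j
    df.iloc[j] = row_i


def bubble_to_front(df, pos):
    """Move the row at position pos to the top via adjacent swaps."""
    for k in range(pos, 0, -1):
        swap_rows(df, k, k - 1)


t = pd.DataFrame(
    {"Name": ["one", "two", "three", "four"], "Size": [1.0, 2.0, 3.0, 4.0]}
)
bubble_to_front(t, 2)
print(t)
```

Only the row values move; the index labels stay where they are, which is why the result keeps the index 0..3. Moving the row at position pos takes pos swaps, so for rows far from the top a single reindex or iloc selection is likely simpler.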

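Another in-place method sets the DataFrame index to the desired ordering, so that, for example, the 3rd row gets index 0, and then sorts the DataFrame in place. It is wrapped in the following function, which assumes that the rows are indexed by range(m) for some positive integer m and that the DataFrame has a simple index (no MultiIndex), as in the example given in the question.

def putfirst(n, df):
    if not isinstance(n, int):
        print('error: 1st arg must be an int')
        return
    if n < 1:
        print('error: 1st arg must be an int > 0')
        return
    if n == 1:
        print('nothing to do when first arg == 1')
        return
    if n > len(df):
        print('error: n exceeds the number of rows in the DataFrame')
        return
    # Rows before the n-th move down one label, the n-th row gets label 0,
    # and the rows after it keep their labels. range() must be converted to
    # a list before concatenating in Python 3.
    df.index = list(range(1, n)) + [0] + list(range(n, df.index[-1] + 1))
    # DataFrame.sort was removed from pandas; sort_index sorts by label.
    df.sort_index(inplace=True)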

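The arguments to putfirst are n, the ordinal position of the row to be moved to the first row (so n = 3 to move the 3rd row), and df, the DataFrame containing the row to be moved.

Here is a demo:

import numpy as np
import pandas as pd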

df = pd.DataFrame(np.random.randn(10, 5),columns=['a', 'b', 'c', 'd', 'e'])

df.set_index("a") # ineffective without assignment or inplace=True
Out[182]: 
                  b         c         d         e
a                                                
 1.394072 -1.076742 -0.192466 -0.871188  0.420852
-1.211411 -0.258867 -0.581647 -1.260421  0.464575
-1.070241  0.804223 -0.156736  2.010390 -0.887104
-0.977936 -0.267217  0.483338 -0.400333  0.449880
 0.399594 -0.151575 -2.557934  0.160807  0.076525
-0.297204 -1.294274 -0.885180 -0.187497 -0.493560
-0.115413 -0.350745  0.044697 -0.897756  0.890874
-1.151185 -2.612303  1.141250 -0.867136  0.383583
-0.437030  0.347489 -1.230179  0.571078  0.060061
-0.225524  1.349726  1.350300 -0.386653  0.865990

df
Out[183]: 
          a         b         c         d         e
0  1.394072 -1.076742 -0.192466 -0.871188  0.420852
1 -1.211411 -0.258867 -0.581647 -1.260421  0.464575
2 -1.070241  0.804223 -0.156736  2.010390 -0.887104
3 -0.977936 -0.267217  0.483338 -0.400333  0.449880
4  0.399594 -0.151575 -2.557934  0.160807  0.076525
5 -0.297204 -1.294274 -0.885180 -0.187497 -0.493560
6 -0.115413 -0.350745  0.044697 -0.897756  0.890874
7 -1.151185 -2.612303  1.141250 -0.867136  0.383583
8 -0.437030  0.347489 -1.230179  0.571078  0.060061
9 -0.225524  1.349726  1.350300 -0.386653  0.865990

df.index
Out[184]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int64')

putfirst(3,df)
df
Out[186]: 
          a         b         c         d         e
0 -1.070241  0.804223 -0.156736  2.010390 -0.887104
1  1.394072 -1.076742 -0.192466 -0.871188  0.420852
2 -1.211411 -0.258867 -0.581647 -1.260421  0.464575
3 -0.977936 -0.267217  0.483338 -0.400333  0.449880
4  0.399594 -0.151575 -2.557934  0.160807  0.076525
5 -0.297204 -1.294274 -0.885180 -0.187497 -0.493560
6 -0.115413 -0.350745  0.044697 -0.897756  0.890874
7 -1.151185 -2.612303  1.141250 -0.867136  0.383583
8 -0.437030  0.347489 -1.230179  0.571078  0.060061
9 -0.225524  1.349726  1.350300 -0.386653  0.865990
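
As a cross-check of the relabel-and-sort idea, the sketch below compares it against selecting the rows with iloc on seeded random data. `put_first_inplace` is a hypothetical variant written for this check: it assumes Python 3, a pandas version that has sort_index, and an index equal to 0..len(df)-1, and it raises an exception instead of printing on bad input.

```python
import numpy as np
import pandas as pd


def put_first_inplace(df, n):
    """Move the n-th row (1-based) of df to the top, modifying df.

    Assumes df.index is 0, 1, ..., len(df) - 1.
    """
    if not isinstance(n, int) or not 1 <= n <= len(df):
        raise ValueError(f"n must be an int between 1 and {len(df)}")
    # Rows before the n-th get labels 1..n-1, the n-th row gets label 0,
    # and the rows after it keep their labels.
    df.index = list(range(1, n)) + [0] + list(range(n, len(df)))
    df.sort_index(inplace=True)


np.random.seed(0)
df = pd.DataFrame(np.random.randn(10, 5), columns=list("abcde"))

# Reference result built with positional selection.
expected = df.iloc[[2, 0, 1] + list(range(3, 10))].reset_index(drop=True)

put_first_inplace(df, 3)
print(np.array_equal(df.to_numpy(), expected.to_numpy()))
```

The caller's df object is updated, but pandas may still copy data internally while sorting, so this is not guaranteed to save memory compared with reindexing.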