pandas DataFrame: select a set of columns, including ranges of columns

Osw*_*ldo 4 select r dataframe pandas

If I have an R data.frame df with

colnames(df)
[1] "a" "b" "c" "d" "e"

then I can select columns "a", "c", "d" and "e" like this:

df[ , c(1, 3:5)]

Does pandas have a simple equivalent? I know I can use

df.loc[:, ['a', 'c', 'd', 'e']]

which is fine for a handful of columns.

With many ranges of columns, though, the R code stays simple:

df2[ , c(1:10, 25:30, 40, 50:100)]

Phi*_*oud 7

Update: there is no need to use numpy.hstack; you can call numpy.r_ directly, as shown below.

Use iloc + numpy.r_:

In [19]: from numpy import r_; from numpy.random import randn; from pandas import DataFrame

In [20]: df = DataFrame(randn(10, 3), columns=list('abc'))

In [21]: df
Out[21]: 
          a         b         c
0  0.228163 -1.311485 -1.335604
1  0.292547 -1.636901  0.001765
2  0.744605 -0.325580  0.205003
3 -0.580471 -0.531553 -0.740697
4  0.250574  1.076019 -0.594915
5 -0.148449  0.076951 -0.653595
6 -1.065314 -0.166018 -1.471532
7  1.133336 -0.529738 -1.213841
8 -1.715281 -2.058831  0.113237
9 -0.382412 -0.072540  0.294853

[10 rows x 3 columns]

In [22]: df.iloc[:, r_[:2]]
Out[22]: 
          a         b
0  0.228163 -1.311485
1  0.292547 -1.636901
2  0.744605 -0.325580
3 -0.580471 -0.531553
4  0.250574  1.076019
5 -0.148449  0.076951
6 -1.065314 -0.166018
7  1.133336 -0.529738
8 -1.715281 -2.058831
9 -0.382412 -0.072540

[10 rows x 2 columns]
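To see why this works, it may help to look at what numpy.r_ produces on its own. The following is a minimal sketch (the variable names are illustrative, not from the answer): each slice written inside the brackets expands to an integer range, and the ranges are concatenated in the order given.

```python
import numpy as np

# Each slice inside np.r_[...] expands to an integer range;
# the pieces are concatenated left to right into one array.
single = np.r_[:2]         # positions 0 and 1
mixed = np.r_[:2, 2:6:2]   # positions 0, 1, then every second position from 2 up to (not including) 6

print(single.tolist())
print(mixed.tolist())
```

The resulting integer array is what iloc receives as its column indexer, so any combination of ranges that numpy.r_ can build can be used to pick columns by position.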

To concatenate integer ranges, use numpy.r_:

In [35]: df = DataFrame(randn(10, 6), columns=list('abcdef'))

In [36]: df.iloc[:, r_[:2, 2:df.columns.size:2]]
Out[36]: 
          a         b         c         e
0 -1.358623 -0.622909  0.025609 -1.166303
1  0.527027  0.310530  2.892384  0.190451
2 -0.251138 -1.246113  0.738264  0.062078
3 -1.716028  0.419139  0.060225 -1.191527
4 -1.308635  0.045396 -0.599367 -0.202491
5 -0.620343  0.796364 -0.008802  0.160020
6  0.199739  0.111816 -0.278119  1.051317
7 -0.311206  0.090348 -0.237887  0.958215
8  0.363161  2.449031  1.023352  0.743853
9  0.039451 -0.855733 -0.836921 -0.835078

[10 rows x 4 columns]
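Applied to the longer R selection from the question, c(1:10, 25:30, 40, 50:100), the same idea could look like the sketch below. The 100-column frame and its column names c1..c100 are made up for illustration. The main thing to watch is the index convention: R positions are 1-based with inclusive ends, while iloc and numpy.r_ are 0-based with exclusive ends, so each start is shifted down by one and each end is kept as written.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df2 from the question: 100 columns named c1..c100.
df2 = pd.DataFrame(np.zeros((3, 100)), columns=[f"c{i}" for i in range(1, 101)])

# R:      df2[ , c(1:10, 25:30, 40, 50:100)]   (1-based, ends inclusive)
# pandas: start - 1 for each range, end unchanged (0-based, ends exclusive);
#         the single column 40 becomes position 39.
cols = np.r_[0:10, 24:30, 39, 49:100]
subset = df2.iloc[:, cols]

print(subset.shape)
print(subset.columns[[0, 10, 16, -1]].tolist())
```

Writing the R numbers next to the translated ones in a comment, as above, makes off-by-one mistakes easier to spot.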