我有一个数据帧df:
20060930 10.103 NaN 10.103 7.981
20061231 15.915 NaN 15.915 12.686
20070331 3.196 NaN 3.196 2.710
20070630 7.907 NaN 7.907 6.459
Run Code Online (Sandbox Code Playgroud)
然后我想选择具有列表中指示的某些序列号的行,假设这里是[1,3],然后左:
20061231 15.915 NaN 15.915 12.686
20070630 7.907 NaN 7.907 6.459
Run Code Online (Sandbox Code Playgroud)
如何或有什么功能可以做到这一点?
Woo*_*ide 111
List = [1, 3]
df.ix[List]
Run Code Online (Sandbox Code Playgroud)
应该做的伎俩!当我使用数据框索引时,我总是使用.ix()方法.它更容易,更灵活......
更新
这不再是可接受的索引方法.该ix方法已弃用.使用.iloc基于整数索引和.loc基于标签索引.
Amr*_*ram 31
另一种方式(虽然它是一个更长的代码)但它比上面的代码更快.使用%timeit函数检查它:
df[df.index.isin([1,3])]
Run Code Online (Sandbox Code Playgroud)
PS:你找出原因
use*_*r42 10
如果index_list包含您想要的索引,您可以通过执行获得具有所需行的数据框
index_list = [1,2,3,4,5,6]
df.loc[df.index[index_list]]
Run Code Online (Sandbox Code Playgroud)
这是基于截至 2021 年 3 月的最新文档。
For large datasets, it is memory efficient to read only selected rows via the skiprows parameter.
Example
pred = lambda x: x not in [1, 3]
pd.read_csv("data.csv", skiprows=pred, index_col=0, names=...)
Run Code Online (Sandbox Code Playgroud)
This will now return a DataFrame from a file that skips all rows except 1 and 3.
Details
From the docs:
skiprows: list-like or integer or callable, defaultNone...
If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be
lambda x: x in [0, 2]
This feature works in version pandas 0.20.0+. See also the corresponding issue and a related post.