How to drop rows with NaN in a pandas DataFrame?

Question

I have this pandas DataFrame, which is actually an Excel spreadsheet:

   Unnamed: 0  Date        Num     Company    Link                                           ID
0  NaN         1990-11-15  131231  apple...   http://www.example.com/201611141492/xellia...  290834
1  NaN         1990-10-22  1231    microsoft  http://www.example.com/news/arnsno...          NaN
2  NaN         2011-10-20  123     apple      http://www.example.com/ator...                 209384
3  NaN         2013-10-27  123     apple...   http://example.com/sections/th-shots/2016/...  098
4  NaN         1990-10-26  123     google     http://www.example.net/business/Drugmak...     098098
5  NaN         1990-10-18  1231    google...  http://example.com/news/va-rece...             NaN
6  NaN         2011-04-26  546     amazon...  http://www.example.com/news/home/20160425...   9809

I would like to drop all rows that have NaN in the ID column and reindex the "imaginary index column":

   Unnamed: 0  Date        Num     Company    Link                                           ID
0  NaN         1990-11-15  131231  apple...   http://www.example.com/201611141492/xellia...  290834
1  NaN         2011-10-20  123     apple      http://www.example.com/ator...                 209384
2  NaN         2013-10-27  123     apple...   http://example.com/sections/th-shots/2016/...  098
3  NaN         1990-10-26  123     google     http://www.example.net/business/Drugmak...     098098
4  NaN         2011-04-26  546     amazon...  http://www.example.com/news/home/20160425...   9809

I know this can be done as follows:

df = df['ID'].dropna()

or

df[df.ID != np.nan]

or

df = df[np.isfinite(df['ID'])]

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

or

df[df.ID()]

or:

df[df.ID != '']

and then:

df.reset_index(drop=True, inplace=True)

However, none of these removes the NaN values in ID. I am getting the previous DataFrame back.

Update

In:

df['ID'].values

Out:

array([ '....A lot of text....',
       nan,
       "A lot of text...",
       "More text",
       'text from the site',
       nan,
       "text from the site"], dtype=object)

Answer 1

Mai*_*lam 5

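Try df.dropna(axis = 1). Note that axis = 1 drops the columns that contain NaN, not the rows.

Or try df.dropna(axis = 0, subset = ["ID"]), which drops only the rows whose ID is NaN, and see if that helps.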
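
A minimal sketch of the row-wise approach, assuming a small made-up frame in place of the asker's spreadsheet (the Company and ID values below are illustrative only). It also shows why the `df.ID != np.nan` attempt from the question returns the frame unchanged: NaN never compares equal to anything, including itself, so that mask is True for every row.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: an object-dtype ID column with NaN.
df = pd.DataFrame({
    "Company": ["apple", "microsoft", "apple", "google"],
    "ID": ["290834", np.nan, "209384", np.nan],
})

# NaN != NaN evaluates to True, so this mask is all True and no row is filtered out.
mask = df["ID"] != np.nan
unchanged = df[mask]

# dropna with subset inspects only the listed columns;
# reset_index(drop=True) renumbers the remaining rows from 0.
cleaned = df.dropna(subset=["ID"]).reset_index(drop=True)

# Equivalent boolean-mask form. notna works on object columns,
# whereas np.isfinite raises a TypeError on them.
cleaned_alt = df[df["ID"].notna()].reset_index(drop=True)

print(cleaned)
```

Both calls return a new DataFrame, so the result has to be assigned back (or used directly). Passing the column as a list, `subset=["ID"]`, also works on pandas versions that do not accept a bare string there.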
