How do I delete all rows of a DataFrame that have NA in a specific column?

Tho*_* W. 8 dataframe julia

What is the most elegant way to delete all rows of a DataFrame that have NA in a specific column?

jub*_*0bs 12

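I don't know whether what follows is the most elegant way to delete all rows that have NA in a specific column, but it is one way.

Generate a toy DataFrame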

julia> df = DataFrame(A = 1:10, B = 2:2:20)
10x2 DataFrame
| Row | A  | B  |
|-----|----|----|
| 1   | 1  | 2  |
| 2   | 2  | 4  |
| 3   | 3  | 6  |
| 4   | 4  | 8  |
| 5   | 5  | 10 |
| 6   | 6  | 12 |
| 7   | 7  | 14 |
| 8   | 8  | 16 |
| 9   | 9  | 18 |
| 10  | 10 | 20 |

julia> df[[1,4,8],symbol("B")] = NA
NA

julia> df
10x2 DataFrame
| Row | A  | B  |
|-----|----|----|
| 1   | 1  | NA |
| 2   | 2  | 4  |
| 3   | 3  | 6  |
| 4   | 4  | NA |
| 5   | 5  | 10 |
| 6   | 6  | 12 |
| 7   | 7  | 14 |
| 8   | 8  | NA |
| 9   | 9  | 18 |
| 10  | 10 | 20 |
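
The transcript above uses the `NA`/`symbol` API of early DataFrames.jl. On current releases missing values are represented by `missing` instead. A minimal sketch of the same toy setup, assuming DataFrames.jl 1.x (this is not the answer author's code):

```julia
using DataFrames

df = DataFrame(A = 1:10, B = 2:2:20)

# Column B starts out as a plain integer column, so it must first be
# widened to accept `missing`.
allowmissing!(df, :B)

# Broadcast-assign `missing` into rows 1, 4 and 8 of column B.
df[[1, 4, 8], :B] .= missing
```

Without the `allowmissing!` call the assignment would fail, because an integer-only column cannot hold `missing`.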

Filter out the rows whose "B"-column element is NA (this returns a new DataFrame; df itself is unchanged)

julia> df[~isna(df[:,symbol("B")]),:]
7x2 DataFrame
| Row | A  | B  |
|-----|----|----|
| 1   | 2  | 4  |
| 2   | 3  | 6  |
| 3   | 5  | 10 |
| 4   | 6  | 12 |
| 5   | 7  | 14 |
| 6   | 9  | 18 |
| 7   | 10 | 20 |

julia> df
10x2 DataFrame
| Row | A  | B  |
|-----|----|----|
| 1   | 1  | NA |
| 2   | 2  | 4  |
| 3   | 3  | 6  |
| 4   | 4  | NA |
| 5   | 5  | 10 |
| 6   | 6  | 12 |
| 7   | 7  | 14 |
| 8   | 8  | NA |
| 9   | 9  | 18 |
| 10  | 10 | 20 |
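
For readers on a current setup, here is a sketch of the same non-mutating filter, assuming DataFrames.jl 1.x, where `ismissing` plays the role of `isna`. The small `df` is made up for illustration:

```julia
using DataFrames

df = DataFrame(A = 1:4, B = [missing, 4, 6, missing])

# Boolean mask: keep the rows where B is not missing
# (closest in spirit to the `~isna(...)` indexing form).
kept = df[.!ismissing.(df.B), :]

# Built-in shortcut restricted to column B; it also returns a new DataFrame.
kept2 = dropmissing(df, :B)
```

Both calls leave `df` untouched. By default `dropmissing` also narrows the element type of column B so that it no longer admits `missing`, whereas the mask form keeps the original column type.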

Delete the rows whose "B"-column element is NA (this modifies df in place)

julia> deleterows!(df,find(isna(df[:,symbol("B")])))
7x2 DataFrame
| Row | A  | B  |
|-----|----|----|
| 1   | 2  | 4  |
| 2   | 3  | 6  |
| 3   | 5  | 10 |
| 4   | 6  | 12 |
| 5   | 7  | 14 |
| 6   | 9  | 18 |
| 7   | 10 | 20 |

julia> df
7x2 DataFrame
| Row | A  | B  |
|-----|----|----|
| 1   | 2  | 4  |
| 2   | 3  | 6  |
| 3   | 5  | 10 |
| 4   | 6  | 12 |
| 5   | 7  | 14 |
| 6   | 9  | 18 |
| 7   | 10 | 20 |
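
`deleterows!`, `find` and `isna` are no longer available in current Julia and DataFrames.jl. A sketch of the in-place deletion under the assumption of DataFrames.jl 1.3 or later (needed for `deleteat!`), with made-up data:

```julia
using DataFrames

df1 = DataFrame(A = 1:4, B = [missing, 4, 6, missing])
df2 = copy(df1)

# Remove, in place, every row whose B is missing.
dropmissing!(df1, :B)

# Index-based form, closest to the original deleterows!/find call:
# collect the row numbers where B is missing, then delete those rows.
deleteat!(df2, findall(ismissing, df2.B))
```

Both variants mutate their argument, so the original rows are gone afterwards, just as in the transcript above.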


小智 6

Say df is your DataFrame and A is the column with missing values. You can do:

nonmissingrows = findin(isna(df[:A]), false)
df = df[nonmissingrows, :]
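
`findin` was removed in Julia 1.0, so the snippet above needs adapting on current versions. A sketch of the same index-then-subset idea, assuming DataFrames.jl 1.x and a made-up `df`:

```julia
using DataFrames

df = DataFrame(A = [1, missing, 3, missing], B = 1:4)

# Row numbers where A holds a value (replaces findin(isna(...), false)).
nonmissingrows = findall(!ismissing, df.A)

# Rebind df to a new DataFrame containing only those rows.
df = df[nonmissingrows, :]
```

As in the original, this builds a new DataFrame and rebinds the name `df` rather than deleting rows in place.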