Filtering a data frame on NA values across multiple columns

Question


I have the following data frame, let's call it df:

id   type   company
1    NA      NA
2    NA      ADM
3    North   Alex
4    South   NA
NA   North   BDA
6    NA      CA


I want to keep only the records that have no NA in both the "type" and "company" columns:

id   type   company
3    North   Alex
NA   North   BDA


I tried:

 df_non_na <- df[!is.na(df$company) || !is.na(df$type), ]


But this did not work.

Thanks in advance.

Answer 1

gra*_*der 14

You will want to use drop_na():

library(dplyr)  # provides the %>% pipe
library(tidyr)  # drop_na() is provided by tidyr, not dplyr

new_df <- df %>% 
    drop_na(type, company)

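If loading tidyr is not an option, the same filter can be sketched in base R with complete.cases() from the stats package. This is a swapped-in alternative rather than part of the answer above, and the data frame here is rebuilt from the question's table as an assumption about how df was created.

```r
# Rebuild the example data frame from the question (assumed construction).
df <- data.frame(
  id      = c(1, 2, 3, 4, NA, 6),
  type    = c(NA, NA, "North", "South", "North", NA),
  company = c(NA, "ADM", "Alex", NA, "BDA", "CA"),
  stringsAsFactors = FALSE
)

# complete.cases() returns TRUE for rows with no NA in the columns passed in.
# Only type and company are checked, so an NA in id does not drop a row.
keep <- complete.cases(df[, c("type", "company")])

new_df <- df[keep, ]
print(new_df)
```

Restricting the check to the two named columns matters here: calling complete.cases(df) on the whole data frame would also drop the row whose id is NA.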

Answer 2

Mar*_*uer 10

An example with dplyr (version >= 1.0.4) and if_all(), since filter_at() has been superseded:

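library(dplyr)  # load first: tibble(), filter() and if_all() are used below

id <- c(1, 2, 3, 4, NA, 6)
type <- c(NA, NA, "North", "South", "North", NA)
company <- c(NA, "ADM", "Alex", NA, "BDA", "CA")

df <- tibble(id, type, company)

# Keep rows where every selected column is non-missing.
df_non_na <- df %>% filter(if_all(c(type, company), ~ !is.na(.)))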


Answer 3

akr*_*run 8

We can get the logical index for each of the two columns, combine them with &, and subset the rows.

df1[!is.na(df1$type) & !is.na(df1$company),]
# id  type company
#3  3 North    Alex
#5 NA North     BDA


Or use rowSums on the logical matrix (is.na(df1[-1])) to subset.

df1[!rowSums(is.na(df1[-1])),]

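To make the rowSums one-liner easier to follow, here is a step-by-step sketch of what it computes. The data frame df1 is rebuilt from the question's table (an assumption), and the intermediate names na_matrix and na_count are introduced only for illustration.

```r
# Rebuild the example data frame from the question (assumed construction).
df1 <- data.frame(
  id      = c(1, 2, 3, 4, NA, 6),
  type    = c(NA, NA, "North", "South", "North", NA),
  company = c(NA, "ADM", "Alex", NA, "BDA", "CA"),
  stringsAsFactors = FALSE
)

# df1[-1] drops the first column (id), leaving type and company.
# is.na() on a data frame returns a logical matrix of the same shape.
na_matrix <- is.na(df1[-1])

# rowSums() counts the TRUE values, i.e. the number of NAs in each row.
na_count <- rowSums(na_matrix)

# !na_count is TRUE exactly where the count is zero, so this is the same
# selection as df1[!rowSums(is.na(df1[-1])), ].
result <- df1[na_count == 0, ]
print(result)
```

Note that df1[-1] checks every column except the first. With more columns in the data frame, selecting the columns by name, for example df1[c("type", "company")], would keep the check limited to the two of interest.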

Answer 4

dam*_*oni 5

You need the AND operator (&), not OR (|). I also strongly recommend the tidyverse approach, using dplyr's filter() function and the pipe operator %>%, which dplyr also makes available:

library(dplyr)
df_not_na <- df %>% filter(!is.na(company) & !is.na(type))

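The difference between the two operators can be seen by comparing the row masks they produce. The following base R sketch rebuilds the question's table (an assumption about how df was created) and uses the illustrative names keep_or and keep_and. It deliberately uses the vectorised | and & rather than ||, because || is the scalar form: older R versions looked only at the first element of each side, and recent versions raise an error when given vectors longer than one.

```r
# Rebuild the example data frame from the question (assumed construction).
df <- data.frame(
  id      = c(1, 2, 3, 4, NA, 6),
  type    = c(NA, NA, "North", "South", "North", NA),
  company = c(NA, "ADM", "Alex", NA, "BDA", "CA"),
  stringsAsFactors = FALSE
)

# Element-wise OR: TRUE when at least one of the two columns is present.
keep_or <- !is.na(df$company) | !is.na(df$type)

# Element-wise AND: TRUE only when both columns are present.
keep_and <- !is.na(df$company) & !is.na(df$type)

print(data.frame(id = df$id, keep_or, keep_and))

# Subset with the AND mask to get the rows the question asks for.
df_non_na <- df[keep_and, ]
print(df_non_na)
```

The OR mask drops only the row where both columns are NA, while the AND mask keeps only the rows where neither is NA, which is the result shown in the question.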

Answer 5

Ric*_*kes 5

With dplyr, you can also use the filter_at function:

library(dplyr)
df_non_na <- df %>% filter_at(vars(type,company),all_vars(!is.na(.)))

