Posts by Chr*_*ris

Number of non-NA records per column, by group

I have a data.table that looks like this:

> dt <- data.table(
  group1 = c("a", "a", "a", "b", "b", "b", "b"),
  group2 = c("x", "x", "y", "y", "z", "z", "z"),
  data1 = c(NA, rep(TRUE, 3), rep(FALSE, 2), "sometimes"),
  data2 = c("sometimes", rep(FALSE, 3), rep(TRUE, 2), NA))

> dt

   group1 group2     data1     data2
1:      a      x        NA sometimes
2:      a      x      TRUE     FALSE
3:      a      y      TRUE     FALSE
4:      b      y      TRUE     FALSE
5:      b      z     FALSE      TRUE
6:      b      z     FALSE      TRUE
7:      b      z sometimes        NA

My goal is to find the number of non-NA records in each data column, grouped by group1 …
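One possible approach, sketched under the assumption that the counts are wanted per `group1` only (the question text is cut off after `group1`, so any further grouping column is unknown): apply `sum(!is.na(x))` to each data column through `.SD`. The names `data_cols` and `counts` are illustrative and not from the original post.

```r
library(data.table)

dt <- data.table(
  group1 = c("a", "a", "a", "b", "b", "b", "b"),
  group2 = c("x", "x", "y", "y", "z", "z", "z"),
  data1 = c(NA, rep(TRUE, 3), rep(FALSE, 2), "sometimes"),
  data2 = c("sometimes", rep(FALSE, 3), rep(TRUE, 2), NA))

# Hypothetical helper name: the columns whose non-NA values should be counted.
data_cols <- c("data1", "data2")

# For each group1 value, count the non-NA entries in every column of .SD.
# Assumption: grouping by group1 alone; use by = .(group1, group2) to add group2.
counts <- dt[, lapply(.SD, function(x) sum(!is.na(x))),
             by = group1, .SDcols = data_cols]

print(counts)
```

Restricting `.SDcols` to the data columns keeps `group2` out of the counts when it is not used for grouping.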

r data.table

4
Votes
1
Answer
102
Views

Combining data.table columns that contain NA

I have a set of five columns in a data.table.

dt <- data.table(
  city = c(rep(1,2), rep(2,2), rep(3,2), rep(4,2)),
  neighborhoods.1 = c(NA, "a", "b", "c", NA, NA, "d", "e"),
  neighborhoods.2 = c(NA, "f", "g", rep(NA,5)),
  neighborhoods.3 = c(NA, "h", rep(NA, 6)),
  irrelevantdata = c(1:8)
)

   city neighborhoods.1 neighborhoods.2 neighborhoods.3 irrelevantdata
1:    1              NA              NA              NA              1
2:    1               a               f               h              2
3:    2               b               g              NA              3
4:    2               c              NA              NA              4
5:    3              NA              NA              NA              5
6:    3              NA              NA              NA              6
7:    4               d …
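The question is truncated before it states the desired result, so the following is only a sketch of one plausible reading: collapse the non-NA values of the three `neighborhoods.*` columns into a single comma-separated string per row, leaving NA where a row has no values at all. The target column name `neighborhoods`, the separator, and the helper name `nb_cols` are assumptions, not taken from the post.

```r
library(data.table)

dt <- data.table(
  city = c(rep(1, 2), rep(2, 2), rep(3, 2), rep(4, 2)),
  neighborhoods.1 = c(NA, "a", "b", "c", NA, NA, "d", "e"),
  neighborhoods.2 = c(NA, "f", "g", rep(NA, 5)),
  neighborhoods.3 = c(NA, "h", rep(NA, 6)),
  irrelevantdata = 1:8
)

# Select the columns to combine by name pattern (hypothetical helper name).
nb_cols <- grep("^neighborhoods\\.", names(dt), value = TRUE)

# Assumed goal: paste the non-NA values of each row together;
# rows with no values at all stay NA instead of becoming "".
dt[, neighborhoods := apply(.SD, 1, function(x) {
  vals <- x[!is.na(x)]
  if (length(vals) == 0) NA_character_ else paste(vals, collapse = ",")
}), .SDcols = nb_cols]

print(dt[, .(city, neighborhoods)])
```

Row-wise `apply` is convenient for a handful of columns; for large tables a vectorised alternative may be preferable.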

r data.table

4
Votes
1
Answer
92
Views

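Remove rows from a data.table where all columns whose names include "question" are NA

I have data from a survey in which any response to a question is considered valid, regardless of whether the questions before or after it were answered.

All of the response data is in a data.table, in columns whose names begin with "question".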

> dt.x <- data.table(
    row = 1:5,
    question_a = c(NA,NA,"A","B","C"),
    question_b = c(NA,"A","B","C","D")
)

> dt.x
   row question_a question_b
1:   1       <NA>       <NA>
2:   2       <NA>          A
3:   3          A          B
4:   4          B          C
5:   5          C          D

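My goal is to remove the rows that have no data in any of the columns beginning with "question", even though other columns, such as the row column in the example, may contain data.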

   row question_a question_b
1:   2       <NA>          A
2:   3          A          B
3:   4          B          C
4:   5          C          D

How can I do this using grep on the column names? I am trying something like

> dt.x[!all(is.na(get(grep("question", names(dt.x), value = T))))]
   row question_a question_b
1:   1       <NA>       <NA>
2:   2       <NA>          A …
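A minimal sketch of one way to get the desired result, offered as a suggestion and not as the accepted answer. The attempt above cannot filter per row because `all()` collapses its input to a single TRUE or FALSE, and `get()` looks up a single object name. Here the matching columns are passed through `.SDcols` and the non-NA values are counted per row with `rowSums()`. The names `q_cols`, `keep`, and `result` are illustrative.

```r
library(data.table)

dt.x <- data.table(
  row = 1:5,
  question_a = c(NA, NA, "A", "B", "C"),
  question_b = c(NA, "A", "B", "C", "D")
)

# Columns whose names begin with "question" (hypothetical helper name).
q_cols <- grep("^question", names(dt.x), value = TRUE)

# One logical per row: TRUE when at least one question column is not NA.
keep <- dt.x[, rowSums(!is.na(.SD)) > 0, .SDcols = q_cols]

# Keep only the rows that have some response data.
result <- dt.x[keep]

print(result)
```

Anchoring the pattern with `^` matches the stated rule that the names begin with "question"; drop the anchor if the word may appear anywhere in the name.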

r data.table

0
Votes
1
Answer
38
Views

Tag statistics

data.table ×3

r ×3