I have a survey response dataset similar to this:
toy <- data.frame(v1 = c(1,2,3), v2 = c(1,6,3), v3 = c(1,2,4), v4 = c(1,7,3))
toy
  v1 v2 v3 v4
1  1  1  1  1
2  2  6  2  7
3  3  3  4  3
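I want to detect "straightlining" by finding the most common value in each row and computing the proportion of columns that hold that value.
Two examples: in row 2 the most common value is 2, which fills 2 of the 4 columns (.50); in row 3 it is 3, which fills 3 of the 4 columns (.75).
Desired output: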
  v1 v2 v3 v4 straightline_pct
1  1  1  1  1                1
2  2  6  2  7              .50
3  3  3  4  3              .75
One base R approach:
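toy <- data.frame(v1 = c(1,2,3), v2 = c(1,6,3), v3 = c(1,2,4), v4 = c(1,7,3))
# For each row: tabulate the values, convert the counts to proportions,
# and keep the largest one (the share held by the most common value).
toy$straightline_pct <- apply(
  as.matrix(toy),
  1L,
  function(x) max(prop.table(table(x)))
)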
toy
#>   v1 v2 v3 v4 straightline_pct
#> 1  1  1  1  1             1.00
#> 2  2  6  2  7             0.50
#> 3  3  3  4  3             0.75
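One thing to watch with the approach above is that `as.matrix(toy)` takes every column, so running it again after `straightline_pct` has been added would count that column too. A minimal sketch of a variant that names the item columns explicitly and skips missing answers might look like the following. The helper name `straightline_pct()`, the `cols` argument, and the choice to divide by the number of non-missing answers are assumptions for illustration, not part of the original answer.

```r
# Hypothetical helper (not from the original answer): for each row, the share
# of the selected columns that hold that row's most common value.
straightline_pct <- function(df, cols = names(df)) {
  apply(as.matrix(df[cols]), 1L, function(x) {
    x <- x[!is.na(x)]                       # assumption: ignore missing answers
    if (length(x) == 0L) return(NA_real_)   # row with no answers at all
    max(table(x)) / length(x)               # count of the modal value / answers given
  })
}

toy <- data.frame(v1 = c(1, 2, 3), v2 = c(1, 6, 3), v3 = c(1, 2, 4), v4 = c(1, 7, 3))
items <- c("v1", "v2", "v3", "v4")          # survey item columns only
toy$straightline_pct <- straightline_pct(toy, items)
toy$straightline_pct
#> [1] 1.00 0.50 0.75
```

Because the columns are passed by name, the call can be repeated safely after the result column exists. Whether missing answers should instead count against the proportion depends on how the survey treats skipped items.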