Here is my data:
group <- c(1,1,1,1,2,2,2,3,3,4,4,4,4)
X1 <- c("A","A","A","A","B","A","B","A","A","B","B","B","B")
X2 <- c("A","A","A","A","B","B","B","A","A","B","B","A","A")
X3 <- c("B","A","A","A","B","B","B","B","B","B","B","B","B")
X4 <- c("A","A","A","B","B","B","A","A","A","B","A","B","B")
X5 <- c("A","A","A","A","B","B","B","A","A","A","B","B","B")
X6 <- c("A","A","A","A","B","A","B","A","A","B","B","A","A")
mydf <- data.frame (group, X1, X2, X3, X4, X5, X6)
So the data is:
   group X1 X2 X3 X4 X5 X6
1      1  A  A  B  A  A  A
2      1  A  A  A  A  A  A
3      1  A  A  A  A  A  A
4      1  A  A  A  B  A  A
5      2  B  B  B  B  B  B
6      2  A  B  B  B  B  A
7      2  B  B  B  A  B  B
8      3  A  A  B  A  A  A
9      3  A  A  B  A  A  A
10     4  B  B  B  B  A  B
11     4  B  B  B  A  B  B
12     4  B  A  B  B  B  A
13     4  B  A  B  B  B  A
Now I need to compare the first row of each group with the remaining rows in that group.
  group X1 X2 X3 X4 X5 X6
1     1  A  A  B  A  A  A
2     1  A  A  A  A  A  A
        TRUE TRUE FALSE TRUE TRUE TRUE
Here the only mismatch is at X3: 1 out of 6 = 1/6 ≈ 17%.
Similarly, compare row 3 with row 1 of group 1.
  group X1 X2 X3 X4 X5 X6
1     1  A  A  B  A  A  A
3     1  A  A  A  A  A  A
Mismatch = 1/6 ≈ 17%
Similarly, compare row 4 with row 1 of group 1.
  group X1 X2 X3 X4 X5 X6
1     1  A  A  B  A  A  A
4     1  A  A  A  B  A  A
Mismatch = 2/6 ≈ 33%
Similarly, for group 2 (the first row of the group is row 5; compare it with row 6):
  group X1 X2 X3 X4 X5 X6
5     2  B  B  B  B  B  B
6     2  A  B  B  B  B  A
Mismatch = 2/6 ≈ 33%
Likewise:
  group X1 X2 X3 X4 X5 X6
5     2  B  B  B  B  B  B
7     2  B  B  B  A  B  B
Mismatch = 1/6 ≈ 17%
My attempt:
match (mydf[1,], mydf[2,])
match (mydf[1,], mydf[3,])
Try this:
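library(plyr)

# For one group: compare the X columns of every row with the group's first row
# and append the fraction of columns that match.
match_ratio <- function(x)
  cbind(x, match_ratio = rowMeans(mapply(`==`, x[1, -1], x[, -1])))

ddply(mydf, "group", match_ratio)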
#    group X1 X2 X3 X4 X5 X6 match_ratio
# 1      1  A  A  B  A  A  A   1.0000000
# 2      1  A  A  A  A  A  A   0.8333333
# 3      1  A  A  A  A  A  A   0.8333333
# 4      1  A  A  A  B  A  A   0.6666667
# 5      2  B  B  B  B  B  B   1.0000000
# 6      2  A  B  B  B  B  A   0.6666667
# 7      2  B  B  B  A  B  B   0.8333333
# 8      3  A  A  B  A  A  A   1.0000000
# 9      3  A  A  B  A  A  A   1.0000000
# 10     4  B  B  B  B  A  B   1.0000000
# 11     4  B  B  B  A  B  B   0.6666667
# 12     4  B  A  B  B  B  A   0.5000000
# 13     4  B  A  B  B  B  A   0.5000000
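The result above is a match ratio, while the question asks for the mismatch percentage, which is simply 100 * (1 - match_ratio). If you would rather not depend on plyr, here is a rough base R sketch of the same comparison that reports the mismatch directly. The names `first_idx`, `vals` and `mismatch_pct` are made up for this illustration, and the sketch assumes the rows of each group are compared against the first row of that group as it appears in the data.

```r
# Example data from the question; stringsAsFactors = FALSE keeps
# X1..X6 as plain character columns on any R version.
mydf <- data.frame(
  group = c(1,1,1,1,2,2,2,3,3,4,4,4,4),
  X1 = c("A","A","A","A","B","A","B","A","A","B","B","B","B"),
  X2 = c("A","A","A","A","B","B","B","A","A","B","B","A","A"),
  X3 = c("B","A","A","A","B","B","B","B","B","B","B","B","B"),
  X4 = c("A","A","A","B","B","B","A","A","A","B","A","B","B"),
  X5 = c("A","A","A","A","B","B","B","A","A","A","B","B","B"),
  X6 = c("A","A","A","A","B","A","B","A","A","B","B","A","A"),
  stringsAsFactors = FALSE
)

# For every row, the index of the first row of its group
# (match() returns the position of the first occurrence).
first_idx <- match(mydf$group, mydf$group)

# Character matrix of the X columns only (the group column is dropped).
vals <- as.matrix(mydf[, -1])

# Percentage of columns that differ from the group's first row.
# mismatch_pct is a hypothetical column name chosen for this sketch.
mydf$mismatch_pct <- 100 * rowMeans(vals != vals[first_idx, , drop = FALSE])

round(mydf$mismatch_pct, 1)
# rows 1-13: 0, 16.7, 16.7, 33.3, 0, 33.3, 16.7, 0, 0, 0, 33.3, 50, 50
```

Because the comparison is done on one matrix for the whole data frame, this version also copes with a group that has only a single row (it just gets 0% mismatch against itself).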