I have a data.frame:
df <- data.frame(id = rep(1:4, each = 3),
                 x = c("A", "B", "C", "D", "E", "A", "A", "C", "D", "A", "C", "E"))
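I want to count the connections within each id, that is, how many ids each pair of x values appears in together. This is the output I'd like to get: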
connections | num. of connections
A - B       | 1
B - C       | 1
C - D       | 1
A - C       | 3
A - E       | 2
A - D       | 2
D - E       | 1
C - E       | 1
How can I do this in dplyr?
It sounds like you're just looking for the crossprod function, which you can use like this:
crossprod(table(df))
#    x
# x   A B C D E
#   A 4 1 3 2 2
#   B 1 1 1 0 0
#   C 3 1 3 1 1
#   D 2 0 1 2 1
#   E 2 0 1 1 2
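If it helps to see why this works: table(df) is an id-by-x incidence matrix, and crossprod(M) is t(M) %*% M, so each off-diagonal cell counts the ids in which both values occur, while the diagonal counts the ids containing each value. Here is a minimal sketch that checks this on the example data. It assumes each (id, x) pair occurs at most once, as it does here; the names M, X, and manual_AC are only for illustration.

```r
df <- data.frame(id = rep(1:4, each = 3),
                 x = c("A", "B", "C", "D", "E", "A", "A", "C", "D", "A", "C", "E"))

# Incidence matrix: one row per id, one column per value of x (cells are 0/1 here)
M <- table(df)

# crossprod(M) computes t(M) %*% M
X <- crossprod(M)
same_values <- all(X == t(M) %*% M)

# Count by hand the ids that contain both "A" and "C"
manual_AC <- sum(tapply(df$x, df$id, function(v) all(c("A", "C") %in% v)))

X["A", "C"] == manual_AC
```

If an id could contain the same value more than once, the cells would count products of frequencies rather than shared ids, so you would probably want to deduplicate the rows first.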
The following gets you closer to your desired output:
library(reshape2)
X <- crossprod(table(df))
X[upper.tri(X, diag = TRUE)] <- NA
melt(X, na.rm = TRUE)
#    x x value
# 2  B A     1
# 3  C A     3
# 4  D A     2
# 5  E A     2
# 8  C B     1
# 9  D B     0
# 10 E B     0
# 14 D C     1
# 15 E C     1
# 20 E D     1
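To go the rest of the way to the layout in the question ("A - B" labels, no zero-count pairs), one possible sketch in base R is shown below. It is not a dplyr solution, and the names idx, res, connections, and n are my own choices for illustration. It keeps the upper triangle instead of the lower one so that each label reads in alphabetical order.

```r
df <- data.frame(id = rep(1:4, each = 3),
                 x = c("A", "B", "C", "D", "E", "A", "A", "C", "D", "A", "C", "E"))

X <- crossprod(table(df))

# Positions above the diagonal with a non-zero count, as (row, column) index pairs
idx <- which(upper.tri(X) & X > 0, arr.ind = TRUE)

# Build the pair labels and pull the counts out with matrix indexing.
# Columns of idx are addressed by position because their names follow the
# dimnames of X rather than always being "row" and "col".
res <- data.frame(
  connections = paste(rownames(X)[idx[, 1]], colnames(X)[idx[, 2]], sep = " - "),
  n = X[idx]
)
res
```

The rows come out in column-major order (A - B, A - C, B - C, ...), so sort res afterwards if you need a particular order.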