Visualizing the association between two groups of data

3 r

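I am looking for a way to visualize the association between two groups of data, where each data point is a pairing of A and B, and there are multiple entries in A and multiple entries in B. That is, there are multiple syndromes and multiple diagnoses, although each data point has exactly one syndrome-diagnosis pair.

Examples, suggestions, or ideas are much appreciated.

This is what the data looks like. I would like to see the links between the values of A and B (how many GG are associated with TT, and so on). Both are nominal data types.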

ID,A,B
1,GG,TT
2,AA,SS
3,BB,XX
4,DD,SS
5,DD,TT
6,CC,XX
7,HH,ZZ
8,AA,TT
9,CC,RR
10,DD,ZZ
11,AA,XX
12,AA,TT
13,DD,SS
14,DD,XX
15,AA,YY
16,CC,ZZ
17,FF,SS
18,FF,XX
19,BB,VV
20,GG,VV
21,GG,SS
22,AA,RR
23,AA,TT
24,AA,SS
25,CC,VV
26,CC,TT
27,FF,RR
28,GG,UU
29,CC,TT
30,BB,ZZ
31,II,TT
32,FF,RR
33,BB,SS
34,GG,YY
35,FF,RR
36,BB,VV
37,II,RR
38,CC,YY
39,FF,VV
40,AA,XX
41,AA,ZZ
42,GG,VV
43,BB,UU
44,II,UU
45,II,SS
46,DD,SS
47,AA,UU
48,BB,VV
49,GG,TT
50,BB,TT

Jon*_*ang 7

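Since your data is bipartite, I would suggest plotting the levels of the first factor along one side and the levels of the second factor along the other, with a line drawn between them for each pair, like this:

[image]

The code I used to generate this is: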

## Make up data: two factors of ten values each.
data <- data.frame(X1 = sample(state.region, 10),
                   X2 = sample(state.region, 10))

## Set up an empty plot window.
plot(0, xlim = c(0, 1), ylim = c(0, 1),
     type = "n", axes = FALSE, xlab = "", ylab = "")

## Map factor levels to evenly spaced positions in [0, 1].
factor.to.int <- function(f) {
  (as.integer(f) - 1) / (nlevels(f) - 1)
}

## One segment per row, from X1 (bottom axis) to X2 (top axis),
## coloured by the level of X1.
segments(factor.to.int(data$X1), 0, factor.to.int(data$X2), 1,
         col = as.integer(data$X1))
axis(1, at = seq(0, 1, length.out = nlevels(data$X1)),
     labels = levels(data$X1))
axis(3, at = seq(0, 1, length.out = nlevels(data$X2)),
     labels = levels(data$X2))

  • Here is the result you should get when applying Jonathan's technique to the example dataset: http://i33.tinypic.com/5ey26o.png (3 upvotes)
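
For reference, a minimal sketch of applying that technique to the question's data. To keep it short, only the first ten rows of the posted data are included, and the data frame name `d` is just an illustrative choice. The plotting code follows the answer above, with `A` on the bottom axis and `B` on the top axis.

```r
## First ten rows of the data posted in the question; A and B become factors.
d <- read.csv(text = "ID,A,B
1,GG,TT
2,AA,SS
3,BB,XX
4,DD,SS
5,DD,TT
6,CC,XX
7,HH,ZZ
8,AA,TT
9,CC,RR
10,DD,ZZ", stringsAsFactors = TRUE)

## Map factor levels to evenly spaced positions in [0, 1].
factor.to.int <- function(f) {
  (as.integer(f) - 1) / (nlevels(f) - 1)
}

## Empty plot window, then one segment per row from A (bottom) to B (top).
plot(0, xlim = c(0, 1), ylim = c(0, 1),
     type = "n", axes = FALSE, xlab = "", ylab = "")
segments(factor.to.int(d$A), 0, factor.to.int(d$B), 1,
         col = as.integer(d$A))
axis(1, at = seq(0, 1, length.out = nlevels(d$A)), labels = levels(d$A))
axis(3, at = seq(0, 1, length.out = nlevels(d$B)), labels = levels(d$B))
```

Pairs that occur more than once are drawn on top of each other, so this view shows which values are linked but not how often.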

Thi*_*rry 5

This is what I would do. A darker colour indicates a more frequent combination of A and B.

library(ggplot2)
## Make up data: 200 random (A, B) pairs.
dataset <- data.frame(A = sample(LETTERS[1:5], 200, prob = runif(5), replace = TRUE),
                      B = sample(LETTERS[1:5], 200, prob = runif(5), replace = TRUE))
## Count how often each (A, B) combination occurs, then draw it as a heatmap.
Counts <- as.data.frame(with(dataset, table(A, B)))
ggplot(Counts, aes(x = A, y = B, fill = Freq)) + geom_tile() +
  scale_fill_gradient(low = "white", high = "black")
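
The counting step is what answers "how many GG are associated with TT". A minimal base R sketch of it on the question's own data follows, assuming the data is read into a data frame called `d` (an illustrative name). Only the first ten rows of the posted data are included here.

```r
## First ten rows of the data posted in the question.
d <- read.csv(text = "ID,A,B
1,GG,TT
2,AA,SS
3,BB,XX
4,DD,SS
5,DD,TT
6,CC,XX
7,HH,ZZ
8,AA,TT
9,CC,RR
10,DD,ZZ", stringsAsFactors = TRUE)

## Cross-tabulate A against B: each cell counts one (A, B) combination.
tab <- with(d, table(A, B))

## Look up a single combination by name.
tab["GG", "TT"]

## Long format (one row per combination), keeping only pairs that occur.
counts <- as.data.frame(tab)
counts[counts$Freq > 0, ]
```

The long-format `counts` data frame has the same columns (`A`, `B`, `Freq`) as the `Counts` object used in the ggplot2 code, so it can be passed to the same plotting calls.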

Or, if you prefer lines:

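library(ggplot2)
## Make up data: 200 random (A, B) pairs.
dataset <- data.frame(A = sample(letters[1:5], 200, prob = runif(5), replace = TRUE),
                      B = sample(letters[1:5], 200, prob = runif(5), replace = TRUE))
Counts <- as.data.frame(with(dataset, table(A, B)))
## Drop combinations that never occur so that no line is drawn for them.
Counts <- subset(Counts, Freq > 0)
## Each combination becomes a segment from (0, level of A) to (1, level of B).
Counts$X <- 0
Counts$Xend <- 1
Counts$Y <- as.numeric(Counts$A)
Counts$Yend <- as.numeric(Counts$B)
## Line thickness encodes the count (in ggplot2 >= 3.4.0, use linewidth = Freq).
ggplot(Counts, aes(x = X, xend = Xend, y = Y, yend = Yend, size = Freq)) +
  geom_segment() +
  scale_x_continuous(breaks = 0:1, labels = c("A", "B")) +
  scale_y_continuous(breaks = 1:5, labels = letters[1:5])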

A third option uses geom_text() to add labels to the data points.

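library(ggplot2)
## Make up data: A has five levels, B has seven.
dataset <- data.frame(
    A = sample(letters[1:5], 200, prob = runif(5), replace = TRUE),
    B = sample(LETTERS[20:26], 200, prob = runif(7), replace = TRUE)
)
Counts <- as.data.frame(with(dataset, table(A, B)))
## Drop combinations that never occur so that no line is drawn for them.
Counts <- subset(Counts, Freq > 0)
Counts$X <- 0
Counts$Xend <- 1
Counts$Y <- as.numeric(Counts$A)
Counts$Yend <- as.numeric(Counts$B)
## Segments as before (in ggplot2 >= 3.4.0, use linewidth = Freq), with the
## levels of A labelled at the left ends and the levels of B at the right ends.
ggplot(Counts, aes(x = X, xend = Xend, y = Y, yend = Yend)) +
  geom_segment(aes(size = Freq)) +
  scale_x_continuous(breaks = 0:1, labels = c("A", "B")) +
  scale_y_continuous(breaks = -1) +  # a break outside the data range hides the y ticks
  geom_text(aes(x = X, y = Y, label = A), colour = "red", size = 7, hjust = 1, vjust = 1) +
  geom_text(aes(x = Xend, y = Yend, label = B), colour = "red", size = 7, hjust = 0, vjust = 0)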

  • Here is a png of Thierry's second solution: http://i38.tinypic.com/2d0juw6.png (4 upvotes)