H K*_*H K 4 visualization r ggplot2 plotly
我想创建一个工作相似性可视化,看起来与此处找到的类似: https: //www.irecsolarcareermap.org/。
最初,我尝试使用力网络,但我注意到此函数不提供分类 X 轴和 Y 轴。
这次,我尝试使用 ggplot。基本的可视化看起来不错,但我想在单击代表职位的圆圈时添加指向类似职位的链接。
“data.csv”文件包含 Occ1、Occ2、完全可转移性(相似性级别)和职业(用于合并)等列。它有超过 10,000 行,我需要将每个 Occ1 到 Occ2 进行匹配。
职位:汽车工程技术员应具有电子/电气装配工和电子装配工的线路(链接)
此外,“experience.csv”文件包含“Occupation”和“Strata.Level”列,共 126 行。“jobType”文件包含 Occupation 和 Job_type 列,共 176 行。
如果一份工作不属于经验和工作类型,我想将其删除。
我尝试过的是:
library(ggplot2)
library(plotly)
# Read the data
data <- read.csv("data.csv")
experience <- read.csv("experience.csv")
jobType <- read.csv("JobType.csv")
# Filter the data based on Full Transferability
filtered_data <- subset(data, Full.Transferability >= 0.9)
# Get all unique occupations from filtered data, jobType, and experience
all_occupations <- unique(c(filtered_data$Occ1, filtered_data$Occ2, jobType$Occupation, experience$Occupation))
# Create nodes dataframe with x and y coordinates
nodes <- data.frame(
name = all_occupations,
x = jobType$Job_type[match(all_occupations, jobType$Occupation)],
y = experience$Strata.Level[match(all_occupations, experience$Occupation)]
)
# Remove rows with missing x or y values
nodes <- nodes[complete.cases(nodes$x, nodes$y), ]
# Create a scatterplot with jittering
gg <- ggplot(nodes, aes(x = x, y = y, text = name)) +
geom_jitter(width = 0.2, height = 0.2, size = 1, color = "steelblue") +
labs(x = "Job Type", y = "Experience Level") +
theme_minimal() +
theme(panel.grid = element_blank()) +
geom_hline(aes(yintercept = y), color = "gray", linetype = "dashed") +
geom_vline(aes(xintercept = x), color = "gray", linetype = "dashed") +
coord_cartesian(clip = "off") +
theme(plot.margin = margin(20, 20, 20, 20))
# Convert the ggplot to a plotly object
p <- ggplotly(gg)
# Register click event handler
event_register(p, "plotly_click")
# Define JavaScript function to handle the click event
js <- "
function(eventData) {
var selectedJob = eventData.points[0].text;
alert('Selected job: ' + selectedJob);
}
"
# Add the JavaScript function to the plot
p <- htmlwidgets::prependContent(p, htmltools::tags$script(js))
p
Run Code Online (Sandbox Code Playgroud)
到目前为止,我已经得到了这个初步结果。
# Define JavaScript function to handle the click event
Run Code Online (Sandbox Code Playgroud)
无需出现弹出消息。我只是想在点击职位时显示职位名称。
您知道如何使职位可点击并将其链接到类似的职位吗?”
此更新添加了满足您最新请求的功能。
\nunhover)mousemove\n最多:1 组线条和 2 个工具提示;点击激活;双击清除
\np %>% htmlwidgets::onRender(\n "function(el, x) {\n nms = [\'curveNumber\', \'pointNumber\'];\n coll = []; /* for persistent tooltip */\n giveMe = []; /* for connected data points */\n oArr = el.data[0]; /* the x, y data for the scatter trace */\n redu = function(val, arr) { /* closest data point in array*/\n return arr.reduce((these, those) => {\n return Math.abs(those - val) < Math.abs(these - val) ? those : these;\n });\n }\n closest = function(xval, yval) { /* p.xvals/yvals from pt data; arr is x/y data obj */\n /* id nearest x and nearest y, make sure they match, if no match, take larger index */\n xpt = redu(xval, oArr.x); /* get closest data point for x axis*/\n ypt = redu(yval, oArr.y); /* get closest data point for y axis*/\n xi = oArr.x.indexOf(xpt); /* get index value for x data point */\n yi = oArr.x.indexOf(ypt); /* get index value for x data point */\n return xi > yi ? xi : yi; /* if the indices != return larger # */\n }\n el.on(\'plotly_hover\', function(p) {\n pt = p; /* global: for use in unhover */\n })\n el.on(\'plotly_unhover\', function(p) { /* create persistent tooltips */\n if(coll.length > 0){ /* if click occurred else no persistence */\n if(giveMe.length < 1) return; /* are there lines connecting points? */\n if(!Array.isArray(giveMe)) giveMe = [giveMe]; /* make sure its an array */\n whatNow = closest(pt.xvals[0], pt.yvals[0]); /* mouse on connected point? */\n if(giveMe.includes(whatNow)) { /* if hover pointIndex is connected */\n coll[1] = whatNow; /* add connected point to array for tips */\n hvr = []; /* clear array for curve & point list */\n for(ea in coll) { /* create list for hovering */ \n var oj = {}; oj[nms[0]] = 0; \n oj[nms[1]] = coll[ea]; \n hvr.push(oj);\n }\n } else {\n hvr = [{\'curveNumber\': 0, \'pointNumber\': coll[0]}]; /* if coll, create tooltip */\n }\n Plotly.Fx.hover(el, hvr); /* persistent tooltips */\n } \n })\n el.on(\'plotly_click\', function(p) { /* create persistent lines upon click */\n /* if any lines already vis-- hide them */\n Plotly.restyle(el, {\'visible\': false}, pt.xaxes[0]._traceIndices.slice(1,));\n giveIt = p.points[0].pointIndex; /* capture scatter index for curve number */\n if(p.points[0].customdata) {\n giveMe = p.points[0].customdata; /* get point\'s array of customdata */\n } else {giveMe = []}\n coll[0] = giveIt; /* collect index for persistent tooltip */\n Plotly.restyle(el, {\'visible\': true}, [giveIt + 1]);\n })\n el.on(\'plotly_doubleclick\', function(p) { /* remove lines & pers tooltips */\n Plotly.restyle(el, {\'visible\': false}, pt.xaxes[0]._traceIndices.slice(1,));\n coll = []; /* reset arrays, until next double click */\n giveMe = [];\n }) \n }")\nRun Code Online (Sandbox Code Playgroud)\n这是对该代码中发生的情况的解释(一般而言)。
\ncoll将包含持久工具提示的点索引
giveMe将包含连接数据点的索引(customdata添加到图中的索引)
oArr、redu()、 &closest()用于计算最近的数据点(当您使工具提示持久化时,Plotly 不会识别或计算新的悬停点,但它仍然捕获屏幕位置。
Onhover仅捕获悬停数据作为全局变量。悬停数据包含鼠标的屏幕位置。
任何可见的线条都click被删除;单击的点将成为持久的工具提示;线条被绘制到连接的数据上。此外,还捕获了连接的数据点索引customdata(这是giveMe)。giveMe被用在unhover.
在 上unhover,如果选择了原点(单击了一个点并且图形上有线条),则...(如果没有发生单击,则此函数不执行任何操作)。如果giveMe为空,则表明没有连接数据(没有线条)——不会创建持久工具提示。如果存在连接,则计算所有鼠标移动以确定鼠标是否位于由线连接的数据点上方。(幕后发生了很多事情。)此函数使用oArr、redu、 和closest来实现此目的,并在满足条件时创建第二个工具提示。
在 上doubleclick,线条和工具提示的持久性被删除。
我创建了两个新选项。第一个是你所要求的。然而,它非常笨重。我想你可能更喜欢第二种选择。
\n我注意到,当我创建nodes3数据时,并没有按照我的预期创建数据。这也让我发现了一些弱点lapply。这些也在这里修复。
有lapply针对第二个选项的更新,但无论您使用哪个选项,它都会起作用。
# create a simulation of jobs that match\nnodes3 <- lapply(1:nrow(nodes), function(k) {\n thisOne <- nodes$name[k]\n mtch <- nodes$name[\n grep(pattern = paste0("^", substr(thisOne, 1, 1)), nodes$name)]\n mtch <- mtch[!mtch %in% thisOne]\n if(length(mtch) < 1) {\n data.frame(occ1 = character(), occ2 = character(), # if no matches\n x = factor(), y = factor())\n } else {\n data.frame(occ1 = rep(thisOne, length(mtch)), occ2 = mtch, # if matches\n x = nodes$x[k], y = nodes$y[k])\n }\n }) %>% bind_rows()\n\ncdt = list() # list for the connected data point indices (used for 2nd option)\n\n# retain order of points in lines\' traces\ninvisible(lapply(1:nrow(df3), function(j) {\n dt <- df3[j, ] # point the lines will originate from\n mtch <- nodes3 %>% \n filter(x == dt$x1, y == dt$y1, occ1 == dt$nm) %>% # matching occ2\n select(occ2) %>% unlist(use.names = F)\n nodes4 <- df3[df3$nm %in% mtch, ] # extract matched x, y positions\n if(nrow(nodes4) < 1) {\n p <<- p %>% # create trace so indices remain correct!\n add_lines(x = rep(df3[j, ]$x, 2), y = rep(df3[j, ]$y, 2), visible = F) # create lines\n return() # if no similar occupations\n }\n # create segment vectors for x and y\n xs <- lapply(1:nrow(nodes4), function(m) {c(dt$x, nodes4[m, ]$x, NA)}) %>% unlist()\n ys <- lapply(1:nrow(nodes4), function(m) {c(dt$y, nodes4[m, ]$y, NA)}) %>% unlist()\n \n # get row numbers of connected data\n vect <- which(df3$x %in% nodes4$x & df3$y %in% nodes4$y)\n cdt[[j]] <<- vect - 1 # 0 ind in JS, so subtract one from every value\n p <<- p %>% \n add_lines(x = xs, y = ys, visible = F) # create lines\n}))\np\n\np$x$data[[1]]$customdata <- cdt # add vectors to plot (used for 2nd option)\nRun Code Online (Sandbox Code Playgroud)\n在第一个选项中,我使用了plotly_doubleclick. 为了使这项工作正常进行,我修改了p. 我这样做是因为我无法足够快地双击鼠标,以便 Plotly 在没有此参数的情况下注册操作。
p <- ggplotly(gg) %>% config(doubleClickDelay = 1000)\nRun Code Online (Sandbox Code Playgroud)\n保持排队直到点击很快就会变得一团糟。我花了很多时间才发现它的潜在问题。
\np %>% htmlwidgets::onRender( \n "function(el, x) {\n giveMe = Array();\n el.on(\'plotly_hover\', function(p) { /* when hovering add lines */\n tellMe = p.points[0].pointIndex; /* capture scatter index for curve number */\n giveMe.push(tellMe + 1);\n Plotly.restyle(el, {\'visible\': true}, giveMe);\n })\n el.on(\'plotly_doubleclick\', function(p) { /* when unhovering remove lines */\n Plotly.restyle(el, {\'visible\': false}, giveMe);\n giveMe = []; /* clear list after changing to visible = F */\n })\n }")\nRun Code Online (Sandbox Code Playgroud)\n\nplot_click除了悬停方法之外,此版本还使用。当您hover/时unhover,它仍然会显示隐藏线。但是,当您单击数据点时,它将显示每个有线条的点的工具提示。
之前lapply,我创建一个空列表。该列表将存储由线连接的数据的行号,这些线将转换为图中点的索引。
调用后lapply,我将此向量列表添加到第一个跟踪中,作为customdata。这样就可以在 Javascript 中访问这些索引。
以下是通过单击创建自定义特色工具提示的代码。我想指出的是——单击任何地方都行不通,你必须单击你感兴趣的数据点。
\np %>% htmlwidgets::onRender(\n "function(el, x) {\n nms = [\'curveNumber\', \'pointNumber\'];\n el.on(\'plotly_hover\', function(p) { /* when hovering add lines */\n tellMe = p.points[0].pointIndex; /* capture scatter index for curve number */\n Plotly.restyle(el, {\'visible\': true}, [tellMe + 1]);\n })\n el.on(\'plotly_unhover\', function(p) { /* when unhovering remove lines */\n Plotly.restyle(el, {\'visible\': false}, [tellMe + 1]);\n })\n el.on(\'plotly_click\', function(p) {\n var giveMe = p.points[0].customdata; /* get point\'s array of customdata */\n giveMe.push(tellMe); /* add current pointIndex to list */\n hvr = []; /* clear array for curve & point list*/\n for(ea in giveMe) { /* create list for hovering */ \n var oj = {}; oj[nms[0]] = 0; \n oj[nms[1]] = giveMe[ea] + 1; \n hvr.push(oj);\n } \n Plotly.Fx.hover(el, hvr); /* show tooltips for points */\n })\n }")\nRun Code Online (Sandbox Code Playgroud)\n\nlibrary(tidyverse)\nlibrary(plotly)\n\ngg <- ggplot(nodes, aes(x = x, y = y, text = paste0("Selected Jobs: ", name))) +\n geom_jitter(width = 0.2, height = 0.2, size = 1, color = "steelblue") +\n labs(x = "Job Type", y = "Experience Level") +\n theme_minimal() +\n theme(panel.grid = element_blank()) +\n coord_cartesian(clip = "off") +\n theme(plot.margin = margin(20, 20, 20, 20))\n\n# slow click speed required (used with option 1)\np <- ggplotly(gg) %>% config(doubleClickDelay = 1000) \n\n# capture jitter data\ndf3 <- data.frame(x = p$x$data[[1]]$x, y = p$x$data[[1]]$y, \n nm = nodes$name, x1 = nodes$x, y1 = nodes$y)\n\n# create a simulation of jobs that match\nnodes3 <- lapply(1:nrow(nodes), function(k) {\n thisOne <- nodes$name[k]\n mtch <- nodes$name[\n grep(pattern = paste0("^", substr(thisOne, 1, 1)), nodes$name)]\n mtch <- mtch[!mtch %in% thisOne]\n if(length(mtch) < 1) {\n data.frame(occ1 = character(), occ2 = character(), # if no matches\n x = factor(), y = factor())\n } else {\n data.frame(occ1 = rep(thisOne, length(mtch)), occ2 = mtch, # if matches\n x = nodes$x[k], y = nodes$y[k])\n }\n }) %>% bind_rows()\n\ncdt = list() # list for the connected data point indices (used for 2nd option)\n\n# retain order of points in lines\' traces\ninvisible(lapply(1:nrow(df3), function(j) {\n dt <- df3[j, ] # point the lines will originate from\n mtch <- nodes3 %>% \n filter(x == dt$x1, y == dt$y1, occ1 == dt$nm) %>% # matching occ2\n select(occ2) %>% unlist(use.names = F)\n nodes4 <- df3[df3$nm %in% mtch, ] # extract matched x, y positions\n if(nrow(nodes4) < 1) {\n p <<- p %>% # create trace so indices remain correct!\n add_lines(x = rep(df3[j, ]$x, 2), y = rep(df3[j, ]$y, 2), visible = F) # create lines\n return() # if no similar occupations\n }\n # create segment vectors for x and y\n xs <- lapply(1:nrow(nodes4), function(m) {c(dt$x, nodes4[m, ]$x, NA)}) %>% unlist()\n ys <- lapply(1:nrow(nodes4), function(m) {c(dt$y, nodes4[m, ]$y, NA)}) %>% unlist()\n \n # get row numbers of connected data\n vect <- which(df3$x %in% nodes4$x & df3$y %in% nodes4$y)\n cdt[[j]] <<- vect - 1 # 0 ind in JS, so subtract one from every value\n p <<- p %>% \n add_lines(x = xs, y = ys, visible = F) # create lines\n}))\np\n\np$x$data[[1]]$customdata <- cdt # add vectors to plot (used for 2nd option)\n\n#------- Option 1 from update: -------\n# hover to show lines, click to remove lines\n\np %>% htmlwidgets::onRender( \n "function(el, x) {\n giveMe = Array();\n el.on(\'plotly_hover\', function(p) { /* when hovering add lines */\n tellMe = p.points[0].pointIndex; /* capture scatter index for curve number */\n giveMe.push(tellMe + 1);\n Plotly.restyle(el, {\'visible\': true}, giveMe);\n })\n el.on(\'plotly_doubleclick\', function(p) { /* when unhovering remove lines */\n Plotly.restyle(el, {\'visible\': false}, giveMe);\n giveMe = []; /* clear list after changing to visible = F */\n })\n }")\n\n#------- Option 2 from update: -------\n# hover/unhover to show/hide lines; click show tooltips\n\np %>% htmlwidgets::onRender(\n "function(el, x) {\n nms = [\'curveNumber\', \'pointNumber\'];\n el.on(\'plotly_hover\', function(p) { /* when hovering add lines */\n tellMe = p.points[0].pointIndex; /* capture scatter index for curve number */\n Plotly.restyle(el, {\'visible\': true}, [tellMe + 1]);\n })\n el.on(\'plotly_unhover\', function(p) { /* when unhovering remove lines */\n Plotly.restyle(el, {\'visible\': false}, [tellMe + 1]);\n })\n el.on(\'plotly_click\', function(p) {\n var giveMe = p.points[0].customdata; /* get point\'s array of customdata */\n if(giveMe.length < 1) return;\n if(!Array.isArray(giveMe)) giveMe = [giveMe];\n giveMe.push(tellMe); /* add current pointIndex to list */\n hvr = []; /* clear array for curve & point list*/\n for(ea in giveMe) { /* create list for hovering */ \n var oj = {}; oj[nms[0]] = 0; \n oj[nms[1]] = giveMe[ea]; \n hvr.push(oj);\n } \n Plotly.Fx.hover(el, hvr); /* show tooltips for points */\n })\n }")\n\n#------- Original hover/unhover calls in answer -------\n# hover/unhover to show/hide lines\np %>% htmlwidgets::onRender(\n "function(el, x) {\n el.on(\'plotly_hover\', function(p) { /* when hovering add lines */\n tellMe = p.points[0].pointIndex; /* capture scatter index for curve number */\n Plotly.restyle(el, {\'visible\': true}, [tellMe + 1]);\n })\n el.on(\'plotly_unhover\', function(p) { /* when unhovering remove lines */\n Plotly.restyle(el, {\'visible\': false}, [tellMe + 1]);\n })\n }")\n\n\nRun Code Online (Sandbox Code Playgroud)\n注意事项:
\ndputfor 节点。我创建了一组假的连接作业,而不是连接作业。因为如果所有线条始终可见,这将是一团糟,所以我对此进行了修改,以便当您将鼠标悬停时它会创建线条(如您在问题中提供链接的示例图中所示)。
\n\n因为您正在使用ggplot抖动功能,所以每次运行绘图时,它的抖动都会略有不同。为了创建段,请创建对象gg,p就像您已经拥有的那样。(因此抖动位置是永久性的。)
\n\n您可以稍作修改
\ngg,以便事后无需修改悬停文本。而不是text = name使用text = paste0("Selected Jobs: ", name). 在我的代码中,您将看到没有event_registerorprepend(所有这些都被 中的此修改替换ggplot)。
library(plotly)\nlibrary(tidyverse)\n\ngg <- ggplot(nodes, aes(x = x, y = y, text = paste0("Selected Jobs: ", name))) +\n geom_jitter(width = 0.2, height = 0.2, size = 1, color = "steelblue") +\n labs(x = "Job Type", y = "Experience Level") +\n theme_minimal() +\n theme(panel.grid = element_blank()) +\n coord_cartesian(clip = "off") +\n theme(plot.margin = margin(20, 20, 20, 20))\n\n\np <- ggplotly(gg) # create plotly object to get jitter x, y\nRun Code Online (Sandbox Code Playgroud)\n\n从对象中提取抖动数据plotly。这些是x和y,代表散点在图上的位置。
# capture jitter data & combine with nodes data\ndf3 <- data.frame(x = p$x$data[[1]]$x, y = p$x$data[[1]]$y, \n nm = nodes$name, x1 = nodes$x, y1 = nodes$y)\nRun Code Online (Sandbox Code Playgroud)\n我在这里创建了一组假数据来模拟作业之间的连接。您不需要创建此数据,但我将其包含在内是为了重现性。
\n# create a simulation of jobs that match\nnodes3 <- lapply(1:nrow(nodes), function(k) {\n thisOne <- nodes$name[k]\n mtch <- nodes$name[grep(pattern = paste0("^", substr(nodes$name, 1, 1)), nodes$name)]\n mtch <- mtch[!mtch %in% thisOne]\n data.frame(occ1 = rep(thisOne, length(mtch)), occ2 = mtch,\n x = nodes$x[k], y = nodes$y[k])\n}) %>% bind_rows()\nRun Code Online (Sandbox Code Playgroud)\n现在是创建线条的时候了。您将为Nodes数据中的每一行创建一个 Plotly 跟踪。换句话说,绘图上每个散点的一组线。当您将其输入到 Plotly 时,仔细检查行非常重要,创建行的顺序也很重要!消失/重新出现的线的功能是建立在假设线是按照与数据散点相同的顺序创建的基础上的。
我曾经lapply浏览过中的每一行df3(中的行数相同nodes)。使用nmin df3 (name在节点中),我创建的数据被过滤以匹配职业。
\n\n我只使用了职业,但您在问题中确定了其他标准。再说一次,我没有这些数据,所以我无法为您创建这些过滤器。理想情况下,您将创建一个数据集并预先过滤此内容。但是,当您查看此代码时,您将看到我是如何过滤的,并且您也可以在此处更改这些过滤器。
\n
识别出匹配的“点”位置后,我创建代表线段的向量。Plotly 中的线段没有固有的模式。
\n这是 Plotly 中的一个示例。如果我想要 2 个从 (1, 1) 开始并在 (2, 5) 和 (3, 7) 结束的线段,这就是我的 x 和 y 向量的样子
\nx = c(1, 2, NA, 1, 3)
y = c(1, 5, NA, 1, 7)
NA在每个开始位置和结束位置之间放置一个。
由于可能不存在类似的职业,因此我使用 if 语句来查找不匹配的职业。由于可能有许多匹配的职业,因此每个向量(x 和 y)都是通过lapply遍历每个匹配来创建向量而创建的。
一旦确定了 x 和 y,就会创建线条轨迹并将其添加到绘图中。这些痕迹是visible = F。
# retain order of points in lines\' traces\ninvisible(lapply(1:nrow(df3), function(j) {\n dt <- df3[j, ] # the row the lines will originate from\n mtch <- nodes3 %>% \n filter(x == dt$x1, y == dt$y1, occ1 == dt$nm) %>% # extract all matching occ2\n select(occ2) %>% unlist()\n nodes4 <- df3[df3$nm %in% mtch, ] # extract matched x, y positions\n if(nrow(nodes4) < 1) return() # where there are no similar occupations\n xs <- lapply(1:nrow(nodes4), function(m) {c(dt$x, nodes4[m, ]$x, NA)}) %>% unlist()\n ys <- lapply(1:nrow(nodes4), function(m) {c(dt$y, nodes4[m, ]$y, NA)}) %>% unlist()\n\n p <<- p %>% # add lines to plot\n add_lines(x = xs[-(length(xs) - 1)], y = ys[-(length(xs) - 1)], visible = F)\n}))\nRun Code Online (Sandbox Code Playgroud)\n现在是时候添加使线条在您将鼠标悬停在数据点上时出现和消失的功能了。
\n我使用htmlwidgets::onRender()、 Plotly 的事件plotly_hover和plotly_unhover、Plotly.restyle来实现这一点。
当您将鼠标悬停在某个点上时,事件数据包括点索引和曲线编号。曲线编号是迹线的索引。曲线编号也可以用在 中Plotly.restyle。当我创建对象时tellMe,通过忽略它是什么类型的变量,我创建了一个全局变量,从而允许我在另一个函数中使用该值(在一个函数中创建,但在两个函数中使用)。使用迹线索引(+ 1,因为散点是第一条迹线),您将切换该数据点线段的可见性。
p %>% htmlwidgets::onRender(\n "function(el, x) {\n el.on(\'plotly_hover\', function(p) { /* when hovering add lines */\n tellMe = p.points[0].pointIndex; /* capture scatter index for curve number */\n Plotly.restyle(el, {\'visible\': true}, [tellMe + 1]);\n })\n el.on(\'plotly_unhover\', function(p) { /* when unhove
| 归档时间: |
|
| 查看次数: |
219 次 |
| 最近记录: |