Web scraping in parallel with rvest

Lat*_*lia 5 parallel-processing r web-scraping rvest

I want to use rvest to scrape web pages. It works fine when run sequentially, but fails when I run it in parallel.

library(rvest)
library(dplyr)

# Collect the question links from the Stack Overflow front page
LINKS <- read_html("https://stackoverflow.com/") %>%
  html_nodes(".question-hyperlink") %>%
  html_attr(name = "href") %>%
  paste("https://stackoverflow.com", ., sep = "")

# Return the text of every ".label-key" node on page x
Get_values <- function(x) {
  read_html(x) %>%
    html_nodes(".label-key") %>%
    html_text()
}

This works fine:

DATA <- lapply(LINKS[1:10], Get_values) #works fine

This returns NULL:

library(parallel)
DATA <- mclapply(LINKS[1:10], Get_values, mc.cores = 2) #returns NULL
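One way to narrow this down, offered as a sketch rather than a confirmed fix: `mclapply` relies on forking, and a worker that dies can leave `NULL` in its result slots without a visible error. The example below swaps in a socket (PSOCK) cluster via `parLapply`, which starts fresh R sessions instead of forking, and wraps the scraping step in `tryCatch` so any error message comes back as a value. `Get_values_safe` and `pages` are hypothetical names introduced here; `pages` holds literal HTML stand-ins so the sketch runs offline, and would be replaced by `LINKS[1:10]` for the real run.

```r
library(rvest)
library(parallel)

# Hypothetical wrapper around the scraping step: instead of failing silently,
# it returns the error message as a string so it survives the trip back
# from the worker.
Get_values_safe <- function(x) {
  tryCatch(
    read_html(x) %>%
      html_nodes(".label-key") %>%
      html_text(),
    error = function(e) paste("ERROR:", conditionMessage(e))
  )
}

# Stand-in inputs (assumption): literal HTML, which read_html() also accepts.
# Replace with LINKS[1:10] to scrape the real pages.
pages <- c(
  '<html><body><p class="label-key">asked</p><p class="label-key">viewed</p></body></html>',
  '<html><body><p class="label-key">active</p></body></html>'
)

cl <- makeCluster(2)               # PSOCK cluster: fresh R sessions, no forking
clusterEvalQ(cl, library(rvest))   # each worker needs rvest loaded
DATA <- parLapply(cl, pages, Get_values_safe)
stopCluster(cl)

str(DATA)
```

If the socket cluster returns the expected text while `mclapply` still returns `NULL`, that points toward the forked workers rather than the scraping code; if `"ERROR: ..."` strings come back, the message itself shows what went wrong inside the worker.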