use*_*501 2 r web-scraping httr rvest
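Can someone help me, or give me some advice, on how to scrape the table from this URL: https://www.promet.si/portal/sl/stevci-prometa.aspx?
I tried following tutorials with the rvest, httr, and html packages, but had no success with this particular site. Thanks.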
This should help you get started:

    library(RSelenium)
    library(wdman)
    library(seleniumPipes)
    library(rvest)
    library(tidyverse)

    # Start a local Selenium server
    selServ <- selenium(verbose = FALSE)
    selServ$log() # find the port

    # Connect to the server; use the port reported in the log
    remDr <- remoteDr(browserName = "chrome", port = 4567L)

    remDr %>% 
      go("https://www.promet.si/portal/sl/stevci-prometa.aspx")

    # Give the page time to finish loading
    Sys.sleep(5)

    pg <- getPageSource(remDr)

    # Extract the table inside the counter container and convert it to a tibble
    html_node(pg, xpath = ".//div[@id='ctl00_mainContent_ctl00_StvContainer']/table") %>% 
      html_table() %>% 
      as_tibble()
    ## # A tibble: 1,239 x 10
    ##    X1    X2            X3       X4                       X5     X6      X7     X8    X9     X10  
    ##    <lgl> <chr>         <chr>    <chr>                    <chr>  <chr>   <chr>  <chr> <chr>  <lgl>
    ##  1 NA    Lokacija      Cesta    Smer                     Pas    Števil… Hitro… Razm… Stanje NA   
    ##  2 NA    Ajdovščina    R2-444   vzhod - zahod            ""     60      64     81,7  Norma… NA   
    ##  3 NA    Ajdovščina    R2-444   zahod - vzhod            ""     12      62     371,6 Norma… NA   
    ##  4 NA    Ajdovščina 2  R2-444   Ajdovščina - Selo        ""     36      67     117,8 Norma… NA   
    ##  5 NA    Ajdovščina 2  R2-444   Ajdovščina - Selo        ""     12      60     787,1 Norma… NA   
    ##  6 NA    Ajdovščina AC HC-H4    Nova Gorica - Vipava     vozni  96      100    31,5  Norma… NA   
    ##  7 NA    Ajdovščina AC HC-H4    Nova Gorica - Vipava     prehi… 36      124    120,7 Norma… NA   
    ##  8 NA    Ankaran       R2-406   Križ. Moretini - Ankaran ""     96      59     29    Norma… NA   
    ##  9 NA    Ankaran       R2-406   Ankaran - Križ. Moretini ""     12      57     292,1 Norma… NA   
    ## 10 NA    Apače         R2-438   Trate - Gornja Radgona   ""     24      58     110,6 Norma… NA   
    ## # ... with 1,229 more rows
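
The printed tibble still needs some tidying: `X1` and `X10` are empty, the real header sits in row 1, and the numbers are text with decimal commas. Here is a minimal base R sketch of one way to clean that up. The helper name `clean_counter_table` is hypothetical, and `raw` is a three-row stand-in copied from the printed output (truncated labels such as `Števil…` are kept exactly as printed), so you would pass your own scraped table instead.

```r
# Stand-in for the scraped table, copied from the printed output.
# Truncated labels ("Števil…", "Norma…") are kept as printed, not guessed.
raw <- data.frame(
  X1  = c(NA, NA, NA),
  X2  = c("Lokacija", "Ajdovščina", "Ankaran"),
  X3  = c("Cesta", "R2-444", "R2-406"),
  X4  = c("Smer", "vzhod - zahod", "Križ. Moretini - Ankaran"),
  X5  = c("Pas", "", ""),
  X6  = c("Števil…", "60", "96"),
  X7  = c("Hitro…", "64", "59"),
  X8  = c("Razm…", "81,7", "29"),
  X9  = c("Stanje", "Norma…", "Norma…"),
  X10 = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop empty columns, promote row 1 to the header,
# and convert columns that hold only numbers (decimal comma) to numeric.
clean_counter_table <- function(tbl) {
  tbl <- as.data.frame(tbl, stringsAsFactors = FALSE)

  # Drop columns that contain only NA
  keep <- vapply(tbl, function(col) !all(is.na(col)), logical(1))
  tbl <- tbl[, keep, drop = FALSE]

  # Use the first row as column names, then remove it from the data
  names(tbl) <- as.character(unlist(tbl[1, ]))
  tbl <- tbl[-1, , drop = FALSE]
  rownames(tbl) <- NULL

  # A column counts as numeric if it has at least one non-empty value
  # and every non-empty value looks like a number
  looks_numeric <- function(x) {
    filled <- x[!is.na(x) & nzchar(x)]
    length(filled) > 0 && all(grepl("^-?[0-9]+([.,][0-9]+)?$", filled))
  }
  for (i in seq_along(tbl)) {
    if (looks_numeric(tbl[[i]])) {
      tbl[[i]] <- as.numeric(sub(",", ".", tbl[[i]], fixed = TRUE))
    }
  }
  tbl
}

clean <- clean_counter_table(raw)
str(clean)
```

Columns are detected by content rather than by name, so the sketch does not depend on the full header labels, which are cut off in the printed output.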