Scraping an "aspx" page with R

use*_*501 2 r web-scraping httr rvest

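Can someone help me, or give me some advice, on how to scrape the table from this URL: https://www.promet.si/portal/sl/stevci-prometa.aspx

I tried following tutorials and using the packages rvest, httr and html, but had no success with this particular site. Thanks.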

hrb*_*str 5

This should help you get started:

library(RSelenium)
library(wdman)
library(seleniumPipes)
library(rvest)
library(tidyverse)

# Start a local Selenium server and connect a Chrome session to it
selServ <- selenium(verbose = FALSE)
selServ$log() # find the port
remDr <- remoteDr(browserName = "chrome", port = 4567L)

remDr %>% 
  go("https://www.promet.si/portal/sl/stevci-prometa.aspx")

# Give the page time to render the table
Sys.sleep(5)

pg <- getPageSource(remDr)

# Extract the table inside the container div and convert it to a tibble
html_node(pg, xpath=".//div[@id='ctl00_mainContent_ctl00_StvContainer']/table") %>% 
  html_table() %>% 
  tbl_df()
## # A tibble: 1,239 x 10
##    X1    X2            X3     X4                       X5     X6      X7     X8    X9     X10  
##    <lgl> <chr>         <chr>  <chr>                    <chr>  <chr>   <chr>  <chr> <chr>  <lgl>
##  1 NA    Lokacija      Cesta  Smer                     Pas    Števil… Hitro… Razm… Stanje NA   
##  2 NA    Ajdovščina    R2-444 vzhod - zahod            ""     60      64     81,7  Norma… NA   
##  3 NA    Ajdovščina    R2-444 zahod - vzhod            ""     12      62     371,6 Norma… NA   
##  4 NA    Ajdovščina 2  R2-444 Ajdovščina - Selo        ""     36      67     117,8 Norma… NA   
##  5 NA    Ajdovščina 2  R2-444 Ajdovščina - Selo        ""     12      60     787,1 Norma… NA   
##  6 NA    Ajdovščina AC HC-H4  Nova Gorica - Vipava     vozni  96      100    31,5  Norma… NA   
##  7 NA    Ajdovščina AC HC-H4  Nova Gorica - Vipava     prehi… 36      124    120,7 Norma… NA   
##  8 NA    Ankaran       R2-406 Križ. Moretini - Ankaran ""     96      59     29    Norma… NA   
##  9 NA    Ankaran       R2-406 Ankaran - Križ. Moretini ""     12      57     292,1 Norma… NA   
## 10 NA    Apače         R2-438 Trate - Gornja Radgona   ""     24      58     110,6 Norma… NA   
## # ... with 1,229 more rows
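In the printed result the real header sits in row 1, the columns X1 and X10 contain only NA, and some numeric values use a decimal comma. A possible clean-up step, sketched in base R, might look like the following. The data frame `raw` is a hypothetical stand-in built from two of the printed rows (including their truncated "…" values); with the live data you would start from the result of `html_table()` instead, and the column positions used here are an assumption based on the output shown.

```r
# Hypothetical sample copied from the printed output (truncated values kept as shown).
raw <- data.frame(
  X1  = c(NA, NA, NA),
  X2  = c("Lokacija", "Ankaran", "Ankaran"),
  X3  = c("Cesta", "R2-406", "R2-406"),
  X4  = c("Smer", "Križ. Moretini - Ankaran", "Ankaran - Križ. Moretini"),
  X5  = c("Pas", "", ""),
  X6  = c("Števil…", "96", "12"),
  X7  = c("Hitro…", "59", "57"),
  X8  = c("Razm…", "29", "292,1"),
  X9  = c("Stanje", "Norma…", "Norma…"),
  X10 = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

# Drop columns that contain only NA (X1 and X10 in the sample).
keep <- !vapply(raw, function(col) all(is.na(col)), logical(1))
tab <- raw[, keep, drop = FALSE]

# Promote the first row to column names, then remove it from the data.
names(tab) <- as.character(unlist(tab[1, ]))
tab <- tab[-1, , drop = FALSE]
rownames(tab) <- NULL

# Columns 5 to 7 (assumed positions) hold numbers, some with a decimal comma.
num_cols <- 5:7
tab[num_cols] <- lapply(
  tab[num_cols],
  function(x) as.numeric(sub(",", ".", x, fixed = TRUE))
)

str(tab)
```

Selecting the numeric columns by position rather than by name avoids depending on the header labels, which are truncated in the printed output.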