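I'm trying to download content from a page, and the response data comes back malformed or incomplete, as if GET or getURL returned before the data had been loaded.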
library(httr)
library(RCurl)
url <- "https://www.vanguardcanada.ca/individual/etfs/etfs.htm"
d1 <- GET(url) # This shows a lot of {{ moustache style }} code that's not filled
d2 <- getURL(url) # This shows "" as if it didn't get anything
I'm not sure how to proceed. My goal is to get the number associated with each link shown in the browser, for example:
https://www.vanguardcanada.ca/individual/etfs/etfs-detail-overview.htm?portId=9548
So in this case, I'd like to download the page and scrape out '9548'.
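To make the target concrete: once a link like the one above is available as a string, the number can be pulled out with a regular expression. This is only a minimal sketch of that last step. `extract_port_id` is a hypothetical helper name, not something from httr or RCurl, and it assumes the link text has already been obtained, which is the part that is not working yet.

```r
# Hypothetical helper (not part of httr or RCurl): pull the numeric portId
# query parameter out of a detail-page link.
extract_port_id <- function(link) {
  # Find the first "portId=<digits>" match; links without one yield character(0)
  m <- regmatches(link, regexpr("portId=[0-9]+", link))
  # Drop the literal "portId=" prefix, leaving only the digits
  sub("portId=", "", m, fixed = TRUE)
}

link <- "https://www.vanguardcanada.ca/individual/etfs/etfs-detail-overview.htm?portId=9548"
extract_port_id(link)
```

The result is a character string rather than a number; wrap the call in `as.integer()` if a numeric value is needed.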
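I'm not sure why getURL and GET return results so different from what the browser displays. It looks as though the data loads slowly, almost as if GET and getURL return before the page has finished loading.
For example, take a look at: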
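library(XML)  # provides htmlParse() and readHTMLTable()
x <- "https://www.vanguardcanada.ca/individual/etfs/etfs-detail-prices.htm?portId=9548"
# Parse the response body as HTML, then extract any tables it contains
readHTMLTable(htmlParse(content(GET(x), as = "text"), asText = TRUE))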