Reading data from a PHP website using R

Question

ath*_*ats 4 php import r web-scraping

I would like to import data into R from a table like this one:

http://www.rout.gr/index.php?name=Rout&file=results&year=2011

I tried using the XML library, as suggested in the following thread, but I got nothing back.

Scraping html tables into R data frames using the XML package

Answer 1

sea*_*ody 5

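There does seem to be something funky going on with that website. It appears not to return any data unless you fake a user agent. Even then, readHTMLTable does not behave well if you pass it the whole doc. Reading the page source, you can see that the relevant table has the id table_results_r_1; isolating that node and passing the result to readHTMLTable works: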

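library(XML)
library(httr)

theurl <- "http://www.rout.gr/index.php?name=Rout&file=results&year=2011"

# Send a browser-like user agent; the site returns no data without one
resp <- GET(theurl, user_agent("Mozilla"))
# Extract the response body as text and parse it as HTML
doc <- htmlParse(content(resp, as = "text"), asText = TRUE)

# Isolate the results table by its id, then convert it to a data frame
results <- xpathSApply(doc, "//*/table[@id='table_results_r_1']")
results <- readHTMLTable(results[[1]])
rm(doc)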

Now you need to tidy up the table column names.
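As a rough sketch of that tidying step: the actual column names on the site are not shown in this thread, so the data frame below is a hypothetical stand-in for the output of readHTMLTable, and clean_names is an illustrative helper, not part of XML or httr. It trims whitespace, lower-cases the names, and replaces runs of spaces and punctuation with underscores, leaving non-ASCII letters alone.

```r
# Hypothetical stand-in for the data frame returned by readHTMLTable();
# the real column names will differ.
results <- data.frame(
  "Pos. "      = c("1", "2"),
  " Team Name" = c("A", "B"),
  "Total pts"  = c("10", "8"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Illustrative helper (an assumption, not a library function)
clean_names <- function(x) {
  x <- trimws(x)                                  # drop surrounding whitespace
  x <- tolower(x)
  x <- gsub("[[:space:][:punct:]]+", "_", x)      # spaces/punctuation become "_"
  x <- gsub("^_+|_+$", "", x)                     # strip underscores at either end
  make.unique(x, sep = "_")                       # disambiguate duplicate names
}

names(results) <- clean_names(names(results))

# Columns scraped from HTML arrive as text; convert where appropriate
results$total_pts <- as.numeric(results$total_pts)
```

Depending on the table, it may be simpler to assign the names by hand with names(results) <- c(...) once you have inspected them.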
