hfi*_*sch 7 r web-crawler rvest
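I am trying to write a crawler to download some information, similar to this Stack Overflow post. The answer there is useful for building the filled-in form, but I am struggling to find a way to submit the form when the submit button is not part of the form. Here is an example: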
session <- html_session("www.chase.com")
form <- html_form(session)[[3]]
filledform <- set_values(form, `user_name` = user_name, `usr_password` = usr_password)
session <- submit_form(session, filledform)
At this point, I get this error:
Error in names(submits)[[1]] : subscript out of bounds
How can I submit this form?
Tri*_*tio 10
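Here is a dirty hack that works for me: after studying the submit_form source code, I figured I could work around the problem by injecting a fake submit button into my copy of the form, which the submit_form function then picks up. It works, except that it emits a warning that often names an irrelevant input object (though not in the example below). Despite the warning, the code works for me: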
session <- html_session("www.chase.com")
form <- html_form(session)[[3]]
# Form on home page has no submit button,
# so inject a fake submit button or else rvest cannot submit it.
# When I do this, rvest gives a warning "Submitting with '___'", where "___" is
# often an irrelevant field item.
# This warning might be an rvest (version 0.3.2) bug, but the code works.
fake_submit_button <- list(name = NULL,
                           type = "submit",
                           value = NULL,
                           checked = NULL,
                           disabled = NULL,
                           readonly = NULL,
                           required = FALSE)
attr(fake_submit_button, "class") <- "input"
form[["fields"]][["submit"]] <- fake_submit_button
user_name <- "user"
usr_password <- "password"
filledform <- set_values(form, `user_name` = user_name, `usr_password` = usr_password)
session <- submit_form(session, filledform)
A successful run shows the following warning, which I simply ignore:
> Submitting with 'submit'
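If the same workaround is needed for more than one form, the injection step could be wrapped in a small helper. This is only a sketch: add_fake_submit is a hypothetical name, not an rvest function, and it assumes the form object keeps its inputs in a "fields" list as in the code above (rvest 0.3.2). The mock form at the end is a hand-built list that only mimics that layout so the helper can be checked offline.

```r
# Hypothetical helper, not part of rvest: adds a placeholder submit field
# to a form object only when none of its fields has type "submit".
add_fake_submit <- function(form, field_name = "submit") {
  has_submit <- any(vapply(
    form[["fields"]],
    function(f) identical(tolower(f[["type"]]), "submit"),
    logical(1)
  ))
  if (!has_submit) {
    # Same field layout as the fake button in the answer above.
    fake <- structure(
      list(name = NULL, type = "submit", value = NULL, checked = NULL,
           disabled = NULL, readonly = NULL, required = FALSE),
      class = "input"
    )
    form[["fields"]][[field_name]] <- fake
  }
  form
}

# Offline check with a mock object (an assumption for illustration only;
# a real form would come from html_form(session)).
mock_form <- list(fields = list(
  user_name = structure(list(name = "user_name", type = "text", value = ""),
                        class = "input")
))
patched <- add_fake_submit(mock_form)
names(patched[["fields"]])
```

With a real session, the call would slot in before set_values, for example form <- add_fake_submit(html_form(session)[[3]]). Because the helper checks for an existing submit field first, calling it on a form that already has one leaves the form unchanged.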