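When using rvest to retrieve h1 headings, I sometimes run into 404 pages. This stops the process and returns this error:
Error in open.connection(x, "rb") : HTTP error 404.
See the example below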
Data<-data.frame(Pages=c(
"http://boingboing.net/2016/06/16/spam-king-sanford-wallace.html",
"http://boingboing.net/2016/06/16/omg-the-japanese-trump-commer.html",
"http://boingboing.net/2016/06/16/omar-mateen-posted-to-facebook.html",
"http://boingboing.net/2016/06/16/omar-mateen-posted-to-facdddebook.html"))
Code used to retrieve the h1
library (rvest)
sapply(Data$Pages, function(url){
url %>%
as.character() %>%
read_html() %>%
html_nodes('h1') %>%
html_text()
})
Is there a way to include an argument that ignores the errors and lets the process continue?
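One possible approach, sketched here rather than taken from the original post, is to wrap the page read in base R's `tryCatch` so that a failed request yields `NA` instead of aborting `sapply`. The helper name `get_h1` and its `reader` argument are hypothetical; the default reader assumes the rvest and xml2 packages are installed.

```r
# Hypothetical helper: returns the h1 text of a page, or NA if the page cannot be read.
# `reader` is an assumed argument so the fetching step can be swapped out or stubbed.
get_h1 <- function(url,
                   reader = function(u) {
                     page <- xml2::read_html(u)
                     rvest::html_text(rvest::html_nodes(page, "h1"))
                   }) {
  tryCatch(
    reader(as.character(url)),
    error = function(e) NA_character_  # e.g. "HTTP error 404."
  )
}

# Usage with the question's data frame (requires network access):
# sapply(Data$Pages, get_h1)
```

With this shape, the misspelled fourth URL would produce `NA` while the other pages are still processed.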
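I am trying to replicate this example provided on the RStudio page.
The problem is that I need to use dates (as.Date with the format %d/%m/%Y) on the x axis, and I get this error when zooming the plot:
Invalid input: date_trans works with objects of class Date only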
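A likely cause, stated as an assumption because the server code is cut off in the post, is that Shiny reports brush bounds as plain numbers (days since 1970-01-01 on a date axis), so they need converting back to `Date` before being reused as axis limits. `brush_to_dates` is a hypothetical helper name.

```r
# Hypothetical helper: turn numeric brush bounds back into Date objects.
brush_to_dates <- function(xmin, xmax) {
  as.Date(c(xmin, xmax), origin = "1970-01-01")
}

# Dates parsed with the format used in the question.
d <- as.Date("01/01/2014", format = "%d/%m/%Y")

# A brush would report bounds as numbers like these.
limits <- brush_to_dates(as.numeric(d), as.numeric(d) + 30)
class(limits)

# In the server function the zoom range could then be set along these lines:
# ranges$x <- brush_to_dates(brush$xmin, brush$xmax)
```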
library(shiny)  # needed for fluidPage(), fluidRow(), plotOutput()
library(ggplot2)
library(scales)
library (grid)
library (DT)
ui <- fluidPage(
fluidRow(
column(width = 12, class = "well", h4("Left plot controls right plot"),
fluidRow(column(width = 6, plotOutput("plot1", height = 300,
brush = brushOpts(
id = "plot2_brush",
resetOnNew = TRUE)))
,
column(width = 6,plotOutput("plot2", height = 300,
click = "plot_click",
dblclick = dblclickOpts(
id = "plot_dblclick")))
,
fluidRow(column(width = 12, dataTableOutput("selected_rows")))
))))
Date <- c("01/01/2014","01/01/2014","01/01/2014","01/01/2014")
Sevdow <- …
I am trying to use a text input widget in a Shiny app to filter rows in a data frame, but I can't get it to work.
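As a minimal sketch of the filtering step on its own, assuming the goal is to keep the rows whose `Name` matches the typed text exactly: `filter_by_name` is a hypothetical helper, and the data frame below is a shortened stand-in for the question's data.

```r
# Hypothetical helper: keep only the rows whose Name equals the typed text.
filter_by_name <- function(df, text) {
  df[df$Name == text, , drop = FALSE]
}

# Shortened stand-in for the question's data frame.
df1 <- data.frame(
  Name  = c("Carlos", "Pete", "Carlos", "Homer"),
  Sales = c(3L, 4L, 7L, 9L)
)

filter_by_name(df1, "Carlos")

# Inside a Shiny server function this could be wrapped in a reactive:
# datasetInput <- reactive({ filter_by_name(df1, input$text) })
```

Keeping the filter in a plain function makes it possible to check it outside the app before wiring it to `input$text`.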
Dataset
df1<-data.frame (Name=c("Carlos","Pete","Carlos","Carlos","Carlos","Pete","Pete","Pete","Pete","Homer"),Sales=(as.integer(c("3","4","7","6","4","9","1","2","1","9"))))
UI
shinyUI(fluidPage(
titlePanel("Sales trends"),titlePanel("People score"),
sidebarLayout(sidebarPanel(
textInput("text", label = h3("Text input"), value = "Enter text..."),
numericInput("obs", "Number of observations to view:", 3),
helpText("Note: while the data view will show only the specified",
"number of observations, the summary will still be based",
"on the full dataset."),
submitButton("Update View")
),
mainPanel(
h4("Volume: Total sales"),
verbatimTextOutput("volume"),
h4("Top people"),
tableOutput("view")
))))
Server
library(shiny)
library (dplyr)
df1<-data.frame (Name=c("Carlos","Pete","Carlos","Carlos","Carlos","Pete","Pete","Pete","Pete","Homer"),Sales=(as.integer(c("3","4","7","6","4","9","1","2","1","9"))))
shinyServer(function(input, output) {
output$value <- renderPrint({ input$text })
datasetInput <- reactive({
switch(input$dataset,df1%>% filter(Name …
Hello, I am trying to retrieve the meta descriptions of these web pages from the page source.
Data<-data.frame(Pages=c(
"http://boingboing.net/2016/06/16/spam-king-sanford-wallace.html",
"http://boingboing.net/2016/06/16/omg-the-japanese-trump-commer.html",
"http://boingboing.net/2016/06/16/omar-mateen-posted-to-facebook.html"))
Desired output
Data$Meta_Description<-data.frame(Extracted=c(
"Sanford Wallace gets 2.5 years in prison for 27 million Facebook",
"OMG, this Japanese Trump Commercial is everything",
"Omar Mateen posted to Facebook during Orlando mass shooting"))
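I tried to do this with httr, but I can't get the result into the desired output format, nor can I extract the description from the content retrieved with the GET command.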
library (httr)
resp<-GET ("http://boingboing.net/2016/06/16/spam-king-sanford-wallace.html")
str(resp)
List of 10
$ url : chr "http://boingboing.net/2016/06/16/spam-king-sanford-wallace.html"
$ status_code: int 200
$ headers :List of 22
..$ server : chr "Apache/2.2"
The field I need to extract from the page source comes right after this string
<meta itemprop="description" content="
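A minimal sketch of one way to pull that field out of the raw HTML, assuming the description always sits in the `content` attribute of a `<meta itemprop="description">` tag and contains no double quotes. `extract_meta_description` is a hypothetical helper that uses only base R regular expressions, and the sample page string is invented for illustration.

```r
# Hypothetical helper: return the content of <meta itemprop="description" ...>,
# or NA if the tag is not present in the HTML text.
extract_meta_description <- function(html) {
  pattern <- '<meta itemprop="description" content="([^"]*)"'
  m <- regmatches(html, regexec(pattern, html))[[1]]
  if (length(m) < 2) {
    return(NA_character_)
  }
  # Collapse line breaks and repeated spaces inside the attribute value.
  trimws(gsub("[[:space:]]+", " ", m[2]))
}

# Invented sample page, used only to show the call.
sample_html <- '<head><meta itemprop="description" content="An example
description"></head>'
extract_meta_description(sample_html)

# With httr, the HTML text could be obtained first (requires network access):
# html <- httr::content(resp, as = "text")
# extract_meta_description(html)
```

A regular expression is fragile against attribute reordering. An HTML parser such as rvest, selecting `meta[itemprop="description"]` and reading its `content` attribute with `html_attr`, would be a sturdier alternative.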
For example, this is how the tag appears in the page source:
<meta itemprop="description" content="'Spam King'
Sanford Wallace gets 2.5 years in prison for …
I need to generate and store many csv files in a folder every day, with the processing time included in the file name.
I tried to append the system time to the file name, but I could not get it to work with paste0
write.csv(output, paste0("C://Users/My Computer/dir", Sys.time(), ".csv"))
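Is it possible to include the system time in the file name, or would the users of these files be better off finding a function that reads them by modification date?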
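One probable reason the `paste0` attempt fails is that the default text form of `Sys.time()` contains spaces and colons, and colons are not allowed in Windows file names. A minimal sketch, assuming a compact `YYYYmmdd_HHMMSS` stamp is acceptable: `timestamped_csv` and its `prefix` argument are hypothetical names.

```r
# Hypothetical helper: build a csv path whose name carries a timestamp
# formatted without spaces or colons.
timestamped_csv <- function(dir, prefix = "output", time = Sys.time()) {
  stamp <- format(time, "%Y%m%d_%H%M%S")
  file.path(dir, paste0(prefix, "_", stamp, ".csv"))
}

# Example path built for the current time.
timestamped_csv("C:/Users/My Computer/dir")

# Writing the data frame would then look like:
# write.csv(output, timestamped_csv("C:/Users/My Computer/dir"))
```

If the files should instead be picked up by modification date, base R's `file.info()` reports an `mtime` column that can be used to sort the paths returned by `list.files()`.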