From the OpenURI rdoc I can't tell what is returned when I do this:
result = open(url)
The URL returns XML, but how do I view/parse the XML?
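
To see what comes back, a minimal sketch follows. It assumes Ruby 2.5 or newer, where `URI.open` is available, and it spins up a throwaway local server (hypothetical, only there so the example runs without a real URL). The returned object is IO-like (a StringIO or Tempfile) extended with `OpenURI::Meta`.

```ruby
require 'open-uri'
require 'socket'

# Hypothetical stand-in for the remote URL: a one-shot local server
# that answers a single request with a small XML document.
xml = '<?xml version="1.0"?><items><item>first</item></items>'
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
server_thread = Thread.new do
  client = server.accept
  # Consume the request headers up to the blank line.
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  client.write "HTTP/1.1 200 OK\r\n" \
               "Content-Type: application/xml\r\n" \
               "Content-Length: #{xml.bytesize}\r\n" \
               "Connection: close\r\n\r\n" \
               "#{xml}"
  client.close
end

result = URI.open("http://127.0.0.1:#{port}/feed.xml")
server_thread.join
server.close

puts result.class           # an IO-like object (StringIO or Tempfile)
puts result.content_type    # metadata added by OpenURI::Meta
puts result.status.inspect  # HTTP status code and message
xml_text = result.read      # the body as a String
puts xml_text
```

Either the object itself or the string from `read` can then be handed to an XML parser, for example `Nokogiri::XML(xml_text)`.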
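
I'm testing how a method handles a 302 HTTPError exception. I tried to stub a method call so that it raises one programmatically, but it keeps complaining about the wrong number of arguments (0 for 2).
The code under test is this particular line: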
document = Nokogiri.HTML open(source_url)
In the spec, I describe it like this:
subject.stub(:open).and_raise(OpenURI::HTTPError)
subject.should_receive(:ended=).with(true)
subject.update_from_remote
I don't think it has anything to do with Nokogiri.HTML() or open-uri's open(), so why is this happening?
Also, how would I make this HTTPError a 302 redirect error? Thanks
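
For context, `OpenURI::HTTPError#initialize` takes two arguments, a message and an IO-like object, which would explain the "0 for 2" complaint when only the bare class is passed to `and_raise`. A minimal sketch, assuming an empty `StringIO` is an acceptable stand-in for the response body and that `'302 Found'` is the message you want:

```ruby
require 'open-uri'
require 'stringio'

# Stand-in for the response body; a real error carries the response IO here.
io = StringIO.new('')

# Build the exception instance explicitly with both required arguments.
error = OpenURI::HTTPError.new('302 Found', io)

begin
  raise error
rescue OpenURI::HTTPError => e
  puts e.message        # the message passed in above
  puts e.io.equal?(io)  # the IO object is exposed through #io
end

# In the spec, the instance (not the class) would then be handed over, e.g.:
#   subject.stub(:open).and_raise(OpenURI::HTTPError.new('302 Found', io))
```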
require 'open-uri'
require 'json'
require 'nokogiri'
doc = Nokogiri::HTML(open("http://www.highcharts.com/demo/"))
puts doc
But I want to be able to extract the JSON from this web page. Using a regular expression doesn't seem to work. Also, how can I extract the JSON via XPath?
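
I know some languages have a library that lets you get the HTTP content of a 404 or 500 response.
Is there a library that allows this in Ruby?
I tried open-uri, but it only raises an HTTPError exception without the HTML content of the 404 response.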
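
As far as I can tell, the exception does carry the response: `OpenURI::HTTPError#io` exposes the body and status. A minimal sketch, assuming Ruby 2.5+ (`URI.open`) and using a throwaway local server (hypothetical, for illustration only) that always answers with a 404 page:

```ruby
require 'open-uri'
require 'socket'

# Hypothetical one-shot local server that replies 404 with an HTML body.
body = '<html><body>custom 404 page</body></html>'
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
server_thread = Thread.new do
  client = server.accept
  # Consume the request headers up to the blank line.
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  client.write "HTTP/1.1 404 Not Found\r\n" \
               "Content-Type: text/html\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n\r\n" \
               "#{body}"
  client.close
end

status = nil
error_body = nil
begin
  URI.open("http://127.0.0.1:#{port}/missing")
rescue OpenURI::HTTPError => e
  status = e.io.status    # status code and message of the failed response
  error_body = e.io.read  # the HTML that came with the 404
end
server_thread.join
server.close

puts status.inspect
puts error_body
```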

I'm trying to scrape a URL that contains special characters (such as the Danish character 'ø').
The URL is:
url = "http://www.zara.com/dk/da/dame/tilbehør/tilbehør/stribet-hue-c271008p2195502.html"
To get OpenURI to recognize it as a valid URL, I do this:
url = Addressable::URI.parse(url).normalize.to_s
and parse it:
doc = Nokogiri::HTML(open(url))
which returns:
OpenURI::HTTPError: 404 Not Found
I don't know why OpenURI returns a 404, since the normalized URL works fine in the browser.
Why does this happen, and what do I need to do to fix it?
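
As a debugging aid, and not as an explanation of the 404, here is a minimal sketch that percent-encodes only the non-ASCII characters as UTF-8 bytes using the standard library. The result can be compared with what `Addressable::URI#normalize` produces. Percent-encoding with `URI.encode_www_form_component` is my own choice for illustration, not something from the question.

```ruby
require 'uri'

url = 'http://www.zara.com/dk/da/dame/tilbehør/tilbehør/stribet-hue-c271008p2195502.html'

# Percent-encode each non-ASCII character as its UTF-8 bytes;
# ASCII characters (including "/" and ":") are left untouched.
encoded = url.gsub(/[^[:ascii:]]/) { |char| URI.encode_www_form_component(char) }

puts encoded
# → http://www.zara.com/dk/da/dame/tilbeh%C3%B8r/tilbeh%C3%B8r/stribet-hue-c271008p2195502.html

# The encoded string is now accepted by the standard URI parser.
uri = URI.parse(encoded)
puts uri.path
```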
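
I've seen many examples of open-uri, and it looks great for simple things. However, it bothers me that requiring it defines a method named open in the global scope.
This is especially troubling because, after exploring in a Rails 5 console, it seems a method named open is already defined: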
irb(main):001:0> open
ArgumentError: wrong number of arguments (given 0, expected 1..3)
from (irb):1:in `initialize'
from (irb):1:in `open'
from (irb):1
from /Users/ahamon/.gem/ruby/2.3.0/gems/railties-5.0.0.beta3/lib/rails/commands/console.rb:65:in `start'
from /Users/ahamon/.gem/ruby/2.3.0/gems/railties-5.0.0.beta3/lib/rails/commands/console_helper.rb:9:in `start'
from /Users/ahamon/.gem/ruby/2.3.0/gems/railties-5.0.0.beta3/lib/rails/commands/commands_tasks.rb:78:in `console'
from /Users/ahamon/.gem/ruby/2.3.0/gems/railties-5.0.0.beta3/lib/rails/commands/commands_tasks.rb:49:in `run_command!'
from /Users/ahamon/.gem/ruby/2.3.0/gems/railties-5.0.0.beta3/lib/rails/command.rb:20:in `run'
from /Users/ahamon/.gem/ruby/2.3.0/gems/railties-5.0.0.beta3/lib/rails/commands.rb:18:in `<top (required)>'
from /Users/ahamon/.gem/ruby/2.3.0/gems/activesupport-5.0.0.beta3/lib/active_support/dependencies.rb:302:in `require'
from /Users/ahamon/.gem/ruby/2.3.0/gems/activesupport-5.0.0.beta3/lib/active_support/dependencies.rb:302:in `block in require'
from /Users/ahamon/.gem/ruby/2.3.0/gems/activesupport-5.0.0.beta3/lib/active_support/dependencies.rb:268:in `load_dependency'
from /Users/ahamon/.gem/ruby/2.3.0/gems/activesupport-5.0.0.beta3/lib/active_support/dependencies.rb:302:in `require'
from /Users/ahamon/code/signist/bin/rails:9:in `<top (required)>'
from /Users/ahamon/.gem/ruby/2.3.0/gems/activesupport-5.0.0.beta3/lib/active_support/dependencies.rb:296:in `load'
from /Users/ahamon/.gem/ruby/2.3.0/gems/activesupport-5.0.0.beta3/lib/active_support/dependencies.rb:296:in `block in load'
from /Users/ahamon/.gem/ruby/2.3.0/gems/activesupport-5.0.0.beta3/lib/active_support/dependencies.rb:268:in `load_dependency'
from /Users/ahamon/.gem/ruby/2.3.0/gems/activesupport-5.0.0.beta3/lib/active_support/dependencies.rb:296:in `load'
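from /Users/ahamon/.gem/ruby/2.3.0/gems/spring-1.6.4/lib/spring/commands/rails.rb:6:in `call' …

I'm using Nokogiri to open Wikipedia pages about various countries and then extract the names of those countries in other languages from the interwiki links (links to the foreign-language Wikipedias). However, when I try to open the page for France, Nokogiri doesn't download the full page. Maybe it's too large; in any case, it doesn't contain the interwiki links I need. How can I force it to download everything?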
Here's my code:
url = "http://en.wikipedia.org/wiki/" + country_name
page = nil
begin
  page = Nokogiri::HTML(open(url))
rescue OpenURI::HTTPError => e
  puts "No article found for " + country_name
end
language_part = page.css('div#p-lang')
Test:
with country_name = "France"
=> []
with country_name = "Thailand"
=> really long array that I don't want to quote here,
but containing all the right data
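Maybe this problem goes beyond Nokogiri and into OpenURI. Either way, I need to find a solution.

I've been trying to use an SSL (SAN) certificate for our internal Rails application.
I created the certificate files with OpenSSL and got them working nicely with Apache: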
<VirtualHost *:443>
  ServerName server-name
  RailsEnv uat
  DocumentRoot /var/www/server-name/current/public
  <Directory /var/www/server-name/current/public>
    AllowOverride All
    Options -MultiViews
  </Directory>
  SSLEngine on
  SSLCertificateFile /etc/apache2/ssl/hostname.cer
  SSLCertificateKeyFile /etc/apache2/ssl/hostname.key
</VirtualHost>
This works fine.
My problem is that when using open-uri to communicate with some of the Rails controllers, I get an error message:
require "net/https"
uri = URI.parse("https://server-name.domain.com/controller.json?date=2013-09-03")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.start {
  http.request_get(uri.path) {|res|
    print res.body
  }
}
OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
I've seen many StackOverflow posts suggesting simply turning off SSL verification, using:
http.verify_mode = OpenSSL::SSL::VERIFY_NONE
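But I don't consider that a satisfactory approach. I've also seen articles suggesting that I need to point open-uri at /etc/ssl/certs/ca-certificates.crt, which is apparently the bundle of public certificates. So for my self-signed certificate, I tried the following: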
uri = URI.parse("https://server-name.domain.com/controller.json?date=2013-09-03")
http = Net::HTTP.new(uri.host, uri.port)
if uri.scheme == "https" …

I'm working with Nokogiri for the first time, searching an HTML document. When I create a variable (and print it) equal to:
Nokogiri::HTML(open(url).read)
it seems to output exactly the same thing as
Nokogiri::HTML(open(url))
Is there a difference?
I couldn't find the answer in the documentation. I tried to work out the difference myself but ran into trouble.

I'm getting the error:
write': "\xCF" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)
from the line:
open(uri) {|url_file| tempfile.write(url_file.read)}
The relevant code is:
require 'tempfile'
require 'open-uri'
require 'uri'
..
uri = URI.parse(@download_link)
tempfile = Tempfile.create(file_name)
open(uri) {|url_file| tempfile.write(url_file.read)}
..
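It works perfectly fine if I run it with ruby lib/file.rb, but it errors out when I run it in the Rails environment: rails runner lib/file.rb.
Most questions about this error involve gem installation scenarios. I'm guessing I have to include/update some gem, but I don't know which.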
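
One possible cause, offered as an assumption rather than a diagnosis: Rails sets `Encoding.default_internal` to UTF-8, so a text-mode file tries to transcode binary (ASCII-8BIT) data on write, while plain `ruby` leaves it unset. The sketch below reproduces that without any download. The two-byte string is a made-up stand-in for `url_file.read`, and setting the encoding defaults by hand only mimics the Rails environment.

```ruby
require 'tempfile'

# Assumption: mimic what the Rails environment configures by default.
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

# Made-up stand-in for downloaded binary content (ASCII-8BIT bytes).
binary_data = "\xCF\xFA".b

# Text-mode write: Ruby tries to transcode the bytes and fails.
text_mode_error = nil
Tempfile.create('download') do |file|
  begin
    file.write(binary_data)
  rescue Encoding::UndefinedConversionError => e
    text_mode_error = e
  end
end
puts text_mode_error.class

# Binary-mode write: the bytes are written as-is, with no transcoding.
bytes_written = nil
Tempfile.create('download') do |file|
  file.binmode
  bytes_written = file.write(binary_data)
end
puts bytes_written
```

If this matches your situation, calling `tempfile.binmode` before `tempfile.write(url_file.read)` may be enough, with no gem changes needed.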