How do I get the value of an HTML image tag attribute in Ruby?

Ash*_*sik 2 ruby parsing

I have this HTML code:

<img src="../../../media/test.jpg" alt="test" />

But I only want this:

"../../../media/test.jpg"

How can I get this in Ruby?

Aru*_*hit 8

Use Nokogiri:

require 'nokogiri'

doc = Nokogiri::XML::DocumentFragment.parse <<-end
<img src="../../../media/test.jpg" alt="test" />
end
node = doc.at_css('img')
# => #(Element:0x49a28e8 {
#      name = "img",
#      attributes = [
#        #(Attr:0x49a2da2 { name = "src", value = "../../../media/test.jpg" }),
#        #(Attr:0x49a2e24 { name = "alt", value = "test" })]
#      })
node.attributes 
# => {"src"=>
#      #(Attr:0x50324ba { name = "src", value = "../../../media/test.jpg" }),
#     "alt"=>#(Attr:0x50324b0 { name = "alt", value = "test" })}
node.keys
# => ["src", "alt"]
node.values
# => ["../../../media/test.jpg", "test"]
node['src']
# => "../../../media/test.jpg"
node['alt']
# => "test"

If you want to remove the alt attribute, you can do the following:

node.delete('alt')
node
# => #(Element:0x49a28e8 {
#      name = "img",
#      attributes = [
#        #(Attr:0x49a2da2 { name = "src", value = "../../../media/test.jpg" })]
#      })
node.values
# => ["../../../media/test.jpg"]

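  • To actually get the answer, I think you want `node['src']`. That will return `"../../../media/test.jpg"`. (2 upvotes)
  • @AshekurRahmanMollaAsik Yes, `node.values` returns an array, but that is not the final answer. Priti showed that you can use `node['src']` to get the answer directly instead of an array. Also, you should know that to get the first item in a Ruby array, you can use `theArray[0]` or `theArray.first`. (2 upvotes)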
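
The array point in the last comment can be shown in plain Ruby without Nokogiri. This is a small sketch in which `values` is a hand-written stand-in for the array that `node.values` returns in the answer. The variable names are illustrative only.

```ruby
# Stand-in for the array returned by node.values in the answer above.
values = ["../../../media/test.jpg", "test"]

# Two equivalent ways to read the first element of an array.
first_by_index  = values[0]
first_by_method = values.first

puts first_by_index
puts first_by_method
```

Both forms depend on the order of the attributes in the tag, which is why looking the attribute up by name with `node['src']` is the more direct choice.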