Spaces between HTML elements in Nokogiri::HTML#content

shi*_*las 1 ruby nokogiri

When I run this

Nokogiri::HTML('<div class="content"><p>Hello</p><p>Good Sir</p></div>').content

I get

"HelloGood Sir"

Is there a way to get the following through Nokogiri's API?

"Hello Good Sir"

Aru*_*hit 6

require 'nokogiri'

doc = Nokogiri::HTML('<div class="content"><p>Hello</p><p>Good Sir</p></div>')

# Select every text node in the document, regardless of its parent tag.
doc.xpath("//text()").map(&:text)
# => ["Hello", "Good Sir"]

doc.xpath("//text()").map(&:text).join(" ")
# => "Hello Good Sir"

# Select every <p> element in the document and take the text content
# of each one.
doc.xpath("//p").map(&:text)
# => ["Hello", "Good Sir"]

doc.xpath("//p").map(&:text).join(" ")
# => "Hello Good Sir"