Pretty-printing XmlSlurper output from HTML in Groovy?

Миш*_*лев 4 html groovy xmlslurper

I have tried several different versions of this, but they all seem to end in this error:

[Fatal Error] :1:171: The prefix "xmlns" cannot be bound to any namespace explicitly; neither can the namespace for "xmlns" be bound to any prefix explicitly.

I load the HTML like this:

// Load html file
def fis = new FileInputStream("2.html")
def html = new XmlSlurper(new org.cyberneko.html.parsers.SAXParser()).parseText(fis.text)

The versions I tried:

http://johnrellis.blogspot.com/2009/08/hmmm_04.html

import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlUtil
def streamingMarkupBuilder=new StreamingMarkupBuilder()
println XmlUtil.serialize(streamingMarkupBuilder.bind{mkp.yield html})

http://old.nabble.com/How-to-print-XmlSlurper%27s-NodeChild-with-indentation--td16857110.html

// Output
import groovy.xml.MarkupBuilder
import groovy.xml.StreamingMarkupBuilder
import groovy.util.XmlNodePrinter
import groovy.util.slurpersupport.NodeChild

def printNode(NodeChild node) {
    def writer = new StringWriter()
    writer << new StreamingMarkupBuilder().bind {
      mkp.declareNamespace('':node[0].namespaceURI())
      mkp.yield node
    }
    new XmlNodePrinter().print(new XmlParser().parseText(writer.toString()))
}

Any suggestions?

Thanks! Misha

Миш*_*лев 5

The problem was the namespaces. Here is the solution:

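import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlUtil

// Turn off SAX namespace processing before handing the parser to XmlSlurper
def saxParser = new org.cyberneko.html.parsers.SAXParser()
saxParser.setFeature('http://xml.org/sax/features/namespaces', false)

// text holds the HTML source, e.g. the contents of 2.html from the question
def page = new XmlSlurper(saxParser).parseText(text)

// Stream the parsed tree back out and let XmlUtil indent it
println XmlUtil.serialize(new StreamingMarkupBuilder().bind {
    mkp.yield page
})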
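
For anyone who wants to try this end to end, here is a minimal sketch of the same approach as a standalone script. Some of it is my assumption rather than part of the original setup: NekoHTML is pulled in through Grape with an illustrative version number, the input is a made-up inline string instead of 2.html, and it targets Groovy 2.x/3.x, where XmlSlurper resolves without an import (on Groovy 4 you would likely need to add import groovy.xml.XmlSlurper).

```groovy
// Assumption: NekoHTML comes from Maven Central via Grape; the version is illustrative.
@Grab('net.sourceforge.nekohtml:nekohtml:1.9.22')
import org.cyberneko.html.parsers.SAXParser
import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlUtil

// Hypothetical sample input: sloppy HTML carrying an xmlns attribute, as many XHTML pages do.
def text = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hello<br>world</body></html>'

// Disable SAX namespace processing on the NekoHTML parser before slurping.
def saxParser = new SAXParser()
saxParser.setFeature('http://xml.org/sax/features/namespaces', false)
def page = new XmlSlurper(saxParser).parseText(text)

// Stream the GPathResult back out as markup and let XmlUtil indent it.
def pretty = XmlUtil.serialize(new StreamingMarkupBuilder().bind { mkp.yield page })
println pretty
```

Keeping the serialized result in a variable (pretty here, a name I made up) makes it easy to write it to a file instead of printing it.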

Thanks! Misha