Validate XML against an XSD schema and catch validator exceptions using Groovy

kan*_*ri3 2 xml validation groovy xsd exception-handling

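import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory

def validateXml(xml){

    String xsd = "src/main/ressources/fulltext-documents-v1.2.3.xsd"

    // Compile the schema, then validate the XML text; throws on the first error
    def factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
    def schema = factory.newSchema(new StreamSource(new File(xsd)))
    def validator = schema.newValidator()
    validator.validate(new StreamSource(new StringReader(xml)))
}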

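This is my function for validating the String representation of an XML document. Below is another function that catches the exception the validator may throw.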

import groovy.xml.XmlUtil
import org.xml.sax.SAXParseException

def xmlVerification(xml) {

    Node rootNode = new XmlParser().parseText(xml)
    def stringXml = XmlUtil.serialize(rootNode)

    try{
        validateXml(stringXml)
        println "no error in text"
    }catch(SAXParseException e){
        // Only the first validation error ever reaches this block
        println "column number "+e.getColumnNumber()
        println "line number "+e.getLineNumber()
    }
}

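For now it only displays the column and line number where the exception was raised (good enough for me at the moment).

Now, let's suppose I have a document with at least 2 errors. I would like to get both errors (in a table, for example) and then deal with them. With my code, validation stops when the first exception is raised, so I can't handle both errors. I have to fix the first one before I can see the second (by running my code a second time).

Any idea how I could go through the whole document, store all the exceptions, and handle them in a .each {} loop or something similar?

Hope that's clear enough.

Thanks in advance!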

tim*_*tes 6

This should do what you want:

import org.xml.sax.ErrorHandler
import static javax.xml.XMLConstants.W3C_XML_SCHEMA_NS_URI
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.Schema
import javax.xml.validation.SchemaFactory
import javax.xml.validation.Validator

List findProblems( File xml, File xsd ) {
  SchemaFactory factory = SchemaFactory.newInstance( W3C_XML_SCHEMA_NS_URI )
  Schema schema = factory.newSchema( new StreamSource( xsd ) )
  Validator validator = schema.newValidator()
  List exceptions = []
  Closure<Void> handler = { exception -> exceptions << exception }
  validator.errorHandler = [ warning:    handler,
                             fatalError: handler,
                             error:      handler ] as ErrorHandler
  validator.validate( new StreamSource( xml ) )
  exceptions
}

// Two files I got for testing
File xml = new File( 'books.xml' )
File xsd = new File( 'books.xsd' )

// Call the method, and print out each exception
findProblems( xml, xsd ).each {
  println "Problem @ line $it.lineNumber, col $it.columnNumber : $it.message"
}

Or, a slightly more idiomatic Groovy version would be:

import org.xml.sax.ErrorHandler
import static javax.xml.XMLConstants.W3C_XML_SCHEMA_NS_URI
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory

List findProblems( File xml, File xsd ) {
  SchemaFactory.newInstance( W3C_XML_SCHEMA_NS_URI )
               .newSchema( new StreamSource( xsd ) )
               .newValidator().with { validator ->
    List exceptions = []
    Closure<Void> handler = { exception -> exceptions << exception }
    errorHandler = [ warning: handler, fatalError: handler, error: handler ] as ErrorHandler
    validate( new StreamSource( xml ) )
    exceptions
  }
}
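Since the question validates a String rather than a File, the same idea can be sketched with `StringReader` sources. This is only an illustration under assumptions: the inline schema, the sample document, and the name `findProblemsInText` are hypothetical and not part of the original question. Note also that a document that is not well-formed may still abort validation after `fatalError` is reported, so this mainly helps with schema violations.

```groovy
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXParseException
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory

// Hypothetical schema: each <book> holds a <title> and an integer <year>
String xsd = '''<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="books">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="year" type="xs:int"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'''

// Hypothetical document with an invalid <year> on line 3 and on line 4
String xml = '''<?xml version="1.0"?>
<books>
  <book><title>A</title><year>not-a-number</year></book>
  <book><title>B</title><year>also-bad</year></book>
</books>'''

// Collects every reported problem instead of stopping at the first one
List<SAXParseException> findProblemsInText( String xmlText, String xsdText ) {
  def validator = SchemaFactory.newInstance( XMLConstants.W3C_XML_SCHEMA_NS_URI )
                               .newSchema( new StreamSource( new StringReader( xsdText ) ) )
                               .newValidator()
  List<SAXParseException> problems = []
  def handler = { SAXParseException e -> problems << e }
  validator.errorHandler = [ warning: handler, error: handler, fatalError: handler ] as ErrorHandler
  validator.validate( new StreamSource( new StringReader( xmlText ) ) )
  problems
}

findProblemsInText( xml, xsd ).each {
  println "Problem @ line $it.lineNumber, col $it.columnNumber : $it.message"
}
```

A single bad value can be reported more than once (for example, once for the datatype and once for the element), so the list may contain more entries than there are mistakes in the document.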

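  • @kanadianDri3 Ah, yes. `Closure<Void> handler` just means that `handler` is a Closure that doesn't return anything. You can change it to `def` if you like. And the closure definition `{ exception -> exceptions << exception }` means _"a closure that takes a single parameter called `exception` and, when called, adds that object to the list `exceptions`"_. The bit to the left of `->` defines the parameters, and the bit to the right defines the body. (2 upvotes)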
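To make the comment above concrete, here is a minimal sketch of that closure used on its own and then through the map-to-interface coercion from the answer. The messages passed in are made up for illustration.

```groovy
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXParseException

List exceptions = []

// Left of -> : the parameter; right of -> : the body that appends to the list
def handler = { exception -> exceptions << exception }

// A closure can be called like a method
handler( 'called directly' )

// Coercing the map to ErrorHandler routes each interface method to the closure
ErrorHandler errorHandler = [ warning: handler, error: handler, fatalError: handler ] as ErrorHandler
errorHandler.error( new SAXParseException( 'called through the interface', null ) )

println exceptions.size()
```

Both calls end up in the same `exceptions` list, which is why the validator in the answer can keep going while the problems accumulate.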