Thu*_*fir 0 java xml dom saxon xml-parsing
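Straight from the manual:
Writing Out a DOM as an XML File
After you have constructed a DOM (either by parsing an XML file or building it programmatically), you frequently want to save it as XML. This section shows you how to do that using the Xalan transform package.
Using that package, you will create a transformer object to wire a DOMSource to a StreamResult. You will then invoke the transformer's transform() method to write out the DOM as XML data.
My output: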
thufir@dur:~/NetBeansProjects/helloWorldSaxon$
thufir@dur:~/NetBeansProjects/helloWorldSaxon$ gradle clean run
> Task :run
Jan 04, 2019 3:28:24 PM helloWorldSaxon.HandlerForXML createDocumentFromURL
INFO: http://books.toscrape.com/
Jan 04, 2019 3:28:26 PM helloWorldSaxon.HandlerForXML createDocumentFromURL
INFO: javax.xml.transform.dom.DOMResult@3cda1055
Jan 04, 2019 3:28:26 PM helloWorldSaxon.HandlerForXML createDocumentFromURL
INFO: html
BUILD SUCCESSFUL in 2s
4 actionable tasks: 4 executed
thufir@dur:~/NetBeansProjects/helloWorldSaxon$
First, I would like more meaningful output that shows what domResult contains or looks like. More importantly, I think, is how to iterate over or traverse the document below:
public void createDocumentFromURL() throws SAXException, IOException, TransformerException, ParserConfigurationException {
    LOG.info(url.toString());
    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    XMLReader xmlReader = XMLReaderFactory.createXMLReader("org.ccil.cowan.tagsoup.Parser");
    Source source = new SAXSource(xmlReader, new InputSource(url.toString()));
    DOMResult domResult = new DOMResult();
    Transformer transformer = transformerFactory.newTransformer();
    transformer.transform(source, domResult); //how do I find the result of this operation?
    LOG.info(domResult.toString()); //traverse or iterate how?
    DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    // Document document = documentBuilder.parse(); ///bzzzt, wrong
    Document document = (Document) domResult.getNode();
    LOG.info(document.getDocumentElement().getTagName());
}
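The output of "html" leads me to believe that this is HTML. The desired output is the HTML itself, but from a Document rather than a String.
The Oracle documentation says that a DOM is a parsed document. Hasn't this document already been parsed? Or, to put it another way, how do I determine whether or not it is an XML document at all?
So... transform it again?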
You really just need to transform the DOM to your file.
Example
// Create DOM
Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
Element root = document.createElement("Root");
document.appendChild(root);
Element foo = document.createElement("Foo");
foo.appendChild(document.createTextNode("Bar"));
root.appendChild(foo);
You can save that DOM to a file like this:
// Write DOM to file as XML
File xmlFile = new File("/path/to/file.xml");
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.transform(new DOMSource(document), new StreamResult(xmlFile));
You can also print the DOM like this:
// Print DOM as XML
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.transform(new DOMSource(document), new StreamResult(System.out));
Output
<?xml version="1.0" encoding="UTF-8" standalone="no"?><Root><Foo>Bar</Foo></Root>
If you want the XML formatted:
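If the XML is wanted as a String (for logging, say) rather than on System.out, the same transform can be pointed at a StringWriter. The following is only a minimal sketch: the class name DomToStringSketch and its helper methods are made up for illustration, and dropping the XML declaration via OMIT_XML_DECLARATION is an optional choice, not something the original example does.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of the original answer.
final class DomToStringSketch {

    // Builds the same Root/Foo/Bar document as the example above.
    static Document sampleDocument() {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = document.createElement("Root");
            document.appendChild(root);
            Element foo = document.createElement("Foo");
            foo.appendChild(document.createTextNode("Bar"));
            root.appendChild(foo);
            return document;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes the DOM into a String by transforming it into a StringWriter.
    static String toXmlString(Document document) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Assumption: the declaration is not wanted in the String; remove this line to keep it.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXmlString(sampleDocument()));
    }
}
```

The checked exceptions are wrapped here only to keep the call sites short; in real code you may prefer to declare them, as the question's method does.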
// Print DOM as formatted XML
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(new DOMSource(document), new StreamResult(System.out));
Output
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Root>
<Foo>Bar</Foo>
</Root>
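As for traversing the Document that comes out of the DOMResult: once transform() has returned, domResult.getNode() is already a parsed tree, so it can be walked with the ordinary org.w3c.dom API. The sketch below is illustrative only. The class DomWalkSketch and its methods are hypothetical names, and it feeds the transformer a small XML string through a StreamSource instead of the question's TagSoup SAXSource so that it runs without extra dependencies.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of the original question or answer.
final class DomWalkSketch {

    // Identity-transforms an XML string into a DOM, mirroring the question's Source -> DOMResult step.
    static Document toDocument(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            DOMResult domResult = new DOMResult();
            transformer.transform(new StreamSource(new StringReader(xml)), domResult);
            return (Document) domResult.getNode();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Depth-first walk that records the name of every element node it visits.
    static void collectElementNames(Node node, List<String> names) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            names.add(node.getNodeName());
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collectElementNames(children.item(i), names);
        }
    }

    // Convenience wrapper that starts the walk at the document node.
    static List<String> elementNames(Document document) {
        List<String> names = new ArrayList<>();
        collectElementNames(document, names);
        return names;
    }

    public static void main(String[] args) {
        Document document = toDocument("<Root><Foo>Bar</Foo></Root>");
        System.out.println(elementNames(document)); // prints [Root, Foo]
    }
}
```

The same recursive walk should apply to the Document built from the TagSoup source in the question; only the Source passed to transform() differs.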