Mec*_*ail 6 python xml xhtml lxml
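The Python library lxml seems to offer several builders for generating HTML documents. What is the difference between them?
However, these generate plain HTML, not XHTML. I could add the xmlns declaration manually, but that is inelegant. So what is the recommended way to generate XHTML documents with lxml?
Example of lxml.builder.E, from http://lxml.de/tutorial.html#the-efactory: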
>>> from lxml import etree
>>> from lxml.builder import E
>>> def CLASS(*args):  # class is a reserved word in Python
...     return {"class": ' '.join(args)}
>>> html = page = (
...   E.html(       # create an Element called "html"
...     E.head(
...       E.title("This is a sample document")
...     ),
...     E.body(
...       E.h1("Hello!", CLASS("title")),
...       E.p("This is a paragraph with ", E.b("bold"), " text in it!"),
...       E.p("This is another paragraph, with a", "\n ",
...         E.a("link", href="http://www.python.org"), "."),
...       E.p("Here are some reserved characters: <spam&egg>."),
...       etree.XML("<p>And finally an embedded XHTML fragment.</p>"),
...     )
...   )
... )
Example of lxml.html.builder, from http://lxml.de/lxmlhtml.html#creating-html-with-the-efactory:
>>> import lxml.html
>>> from lxml.html import builder as E
>>> from lxml.html import usedoctest
>>> html = E.HTML(
...   E.HEAD(
...     E.LINK(rel="stylesheet", href="great.css", type="text/css"),
...     E.TITLE("Best Page Ever")
...   ),
...   E.BODY(
...     E.H1(E.CLASS("heading"), "Top News"),
...     E.P("World News only on this page", style="font-size: 200%"),
...     "Ah, and here's some more text, by the way.",
...     lxml.html.fromstring("<p>... and this is a parsed fragment ...</p>")
...   )
... )
"The Python library lxml seems to offer several builders for generating HTML documents. What is the difference between them?"
lxml.builder.E uses the factory pattern:
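import lxml.html
from lxml.html import builder as E
from lxml.html import usedoctest  # only relevant when run under doctest
html = E.HTML(
    E.HEAD(
        E.LINK(rel="stylesheet", href="great.css", type="text/css"),
        E.TITLE("Best Page Ever")
    ),
    E.BODY(
        E.H1(E.CLASS("heading"), "Top News"),
        E.P("World News only on this page", style="font-size: 200%"),
        "Ah, and here's some more text, by the way.",
        lxml.html.fromstring("<p>... and this is a parsed fragment ...</p>")
    )
)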
lxml.builder uses the prototype pattern:
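from lxml import etree
from lxml.builder import E
def CLASS(*args):  # class is a reserved word in Python
    return {"class": ' '.join(args)}
html = page = (
    E.html(  # create an Element called "html"
        E.head(
            E.title("This is a sample document")
        ),
        E.body(
            E.h1("Hello!", CLASS("title")),
            E.p("This is a paragraph with ", E.b("bold"), " text in it!"),
            E.p("This is another paragraph, with a", "\n ",
                E.a("link", href="http://www.python.org"), "."),
            E.p("Here are some reserved characters: <spam&egg>."),
            etree.XML("<p>And finally an embedded XHTML fragment.</p>"),
        )
    )
)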
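To make the practical difference between the two builders concrete, here is a minimal sketch that builds the same fragment with each one and compares the element classes and the serialized output. The alias H and the variable names are illustrative only and do not come from the lxml documentation.

```python
from lxml import etree
import lxml.html
from lxml.builder import E               # generic XML element factory
from lxml.html import builder as H       # HTML-specific factory functions

# The same fragment, built once with each builder.
generic = E.p("line one", E.br(), "line two")
html_el = H.P("line one", H.BR(), "line two")

# lxml.html.builder returns HtmlElement instances, which carry HTML helper
# methods; lxml.builder.E returns plain etree elements.
print(type(generic).__name__, type(html_el).__name__)

# The XML serializer closes the empty element, the HTML serializer does not.
print(etree.tostring(generic))
print(lxml.html.tostring(html_el))

# Neither builder puts elements into a namespace by default.
print(generic.tag, html_el.tag)
```

In both cases the tags carry no namespace, which is why neither builder produces XHTML as is.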
"I could add the xmlns declaration manually, but that is inelegant."
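One way to avoid writing the xmlns attribute by hand, offered here as a sketch rather than as something the lxml examples above show, is to create a dedicated ElementMaker bound to the XHTML namespace. The constant XHTML_NS, the sample content, and the choice of the XHTML 1.0 Strict doctype are assumptions made for illustration.

```python
from lxml import etree
from lxml.builder import ElementMaker

XHTML_NS = "http://www.w3.org/1999/xhtml"

# namespace= puts every created element into the XHTML namespace;
# mapping the None prefix makes it the default namespace on output.
E = ElementMaker(namespace=XHTML_NS, nsmap={None: XHTML_NS})

page = E.html(
    E.head(E.title("This is a sample document")),
    E.body(
        E.h1("Hello!", {"class": "title"}),
        E.p("This is a paragraph with ", E.b("bold"), " text in it!"),
    ),
)

# Serialize as XML, with an XML declaration and an (assumed) XHTML doctype.
xhtml = etree.tostring(
    page,
    pretty_print=True,
    xml_declaration=True,
    encoding="utf-8",
    doctype='<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
            '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
).decode("utf-8")
print(xhtml)
```

The calling style stays the same as with lxml.builder.E; only the factory object changes, so existing builder code can be reused with little editing.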
XSLT would be another option:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml">
  <xsl:output method="xml" encoding="utf-8" version="" indent="yes" standalone="no" media-type="text/html" omit-xml-declaration="no" doctype-system="about:legacy-compat" />
  <xsl:template match="/">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <xsl:copy-of select="."/>
    </html>
  </xsl:template>
</xsl:stylesheet>
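An XSLT transform can be applied from Python with lxml.etree.XSLT. The sketch below is an assumption-laden illustration, not the stylesheet above: because xsl:copy-of keeps copied elements in their original namespace, it uses a hypothetical variant that recreates every element under the same local name in the XHTML namespace. The names STYLESHEET, to_xhtml, and the sample tree are made up for the example.

```python
from lxml import etree
from lxml.builder import E

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical stylesheet: rebuild each element in the XHTML namespace,
# copy its attributes unchanged, and recurse into its children.
STYLESHEET = etree.XML(b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8"/>
  <xsl:template match="*">
    <xsl:element name="{local-name()}"
                 namespace="http://www.w3.org/1999/xhtml">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
""")

to_xhtml = etree.XSLT(STYLESHEET)

# A namespace-less tree, as produced by lxml.builder.E.
plain = E.html(E.head(E.title("Sample")), E.body(E.p("Hello", id="greeting")))

result = to_xhtml(etree.ElementTree(plain))
root = result.getroot()
print(etree.tostring(root, pretty_print=True).decode())
```

Text nodes pass through via XSLT's built-in template rules, so only elements need an explicit template here.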