Printing an XML element with AWK

Ada*_*tan 3 xml awk

How can I use AWK to print the contents of an XML element, from its opening tag to its closing tag?

For example, consider the following XML:

<flight>
    <airline>Delta</airline>
    <flightno>22</flightno>
    <origin>Atlanta</origin>
    <destination>Paris</destination>
    <departure>5:40pm</departure>
    <arrival>8:10am</arrival>
</flight>
<city id="AT"> 
       <cityname>Athens</cityname> 
       <state>GA</state>
       <description> Home of the University of Georgia</description>
       <population>100,000</population>
       <location>Located about 60 miles Northeast of Atlanta</location>
       <latitude>33 57' 39" N</latitude>
       <longitude>83 22' 42" W</longitude>
</city>

The desired output would be the content of the city element, from <city...> to </city>.

Mar*_*nor 5

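Solutions that parse XML with tools such as awk and sed are fragile. You cannot rely on XML always having a human-readable layout. For example, some web services omit newlines, so the entire XML document arrives on a single line.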
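
To make that limitation concrete, here is a minimal sketch of the usual line-based approach, an awk range pattern. It is not a recommended solution. The file names, the trimmed sample data, and the extract_city helper are all made up for illustration, and the pattern assumes the opening and closing city tags each sit on their own line.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample files, for illustration only.
workdir=$(mktemp -d)

# Multi-line layout, as in the question (trimmed).
cat > "$workdir/multi.xml" <<'EOF'
<flight>
    <airline>Delta</airline>
</flight>
<city id="AT">
    <cityname>Athens</cityname>
    <state>GA</state>
</city>
EOF

# The same kind of data with every newline removed.
cat > "$workdir/oneline.xml" <<'EOF'
<data><flight><airline>Delta</airline></flight><city id="AT"><cityname>Athens</cityname></city></data>
EOF

# Range pattern: print every line from one containing "<city " or "<city>"
# through the next line containing "</city>". The [ >] keeps <cityname>
# from being mistaken for the opening tag.
extract_city() {
    awk '/<city[ >]/,/<\/city>/' "$1"
}

echo "--- multi-line input ---"
extract_city "$workdir/multi.xml"

echo "--- single-line input ---"
extract_city "$workdir/oneline.xml"
```

On the multi-line file this prints just the city element. On the single-line file awk can only print whole lines, so the flight element comes along too, which is exactly the failure mode described above.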

I recommend xmllint, which can select nodes using XPath, a query language designed for XML.

The following command selects the city elements:

xmllint --xpath "//city" data.xml

XPath is very useful. It makes every part of an XML document addressable:

xmllint --xpath "string(//city[1]/@id)" data.xml

This returns the string "AT".
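
A couple more addressing examples, as a sketch only. The data here is a trimmed stand-in for data.xml, the scratch directory and variable names are made up for illustration, and the script assumes an xmllint build that supports --xpath.

```shell
#!/bin/sh
# Skip quietly when xmllint (shipped with libxml2) is not installed.
command -v xmllint >/dev/null 2>&1 || { echo "xmllint not found; skipping" >&2; exit 0; }

# Trimmed stand-in for data.xml; element names match the full file.
workdir=$(mktemp -d)
cat > "$workdir/data.xml" <<'EOF'
<data>
  <city id="AT"> <cityname>Athens</cityname> <state>GA</state> </city>
  <city id="DUB">
    <cityname>Dublin</cityname>
    <state>Dub</state>
  </city>
</data>
EOF

# Text of a child element, selecting the parent by its id attribute.
dub_name=$(xmllint --xpath "string(//city[@id='DUB']/cityname)" "$workdir/data.xml")
echo "$dub_name"    # prints Dublin

# The id attribute of the city whose cityname is Athens.
athens_id=$(xmllint --xpath "string(//city[cityname='Athens']/@id)" "$workdir/data.xml")
echo "$athens_id"   # prints AT
```

Both queries give the same answer whether a city element is spread over several lines or packed onto one, because XPath works on the parsed tree rather than on lines of text.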

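Badly formatted XML data

This time the command returns the first occurrence of the "city" tag. xmllint can also be used to pretty-print the result: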

$ xmllint --xpath "//city[1]" data.xml | xmllint --format -
<?xml version="1.0"?>
<city id="AT">
  <cityname>Athens</cityname>
  <state>GA</state>
  <description> Home of the University of Georgia</description>
  <population>100,000</population>
  <location>Located about 60 miles Northeast of Atlanta</location>
  <latitude>33 57' 39" N</latitude>
  <longitude>83 22' 42" W</longitude>
</city>

data.xml

In this same data, the first "city" element appears entirely on one line. This is valid XML.

<data>
  <flight>
    <airline>Delta</airline>
    <flightno>22</flightno>
    <origin>Atlanta</origin>
    <destination>Paris</destination>
    <departure>5:40pm</departure>
    <arrival>8:10am</arrival>
  </flight>
  <city id="AT"> <cityname>Athens</cityname> <state>GA</state> <description> Home of the University of Georgia</description> <population>100,000</population> <location>Located about 60 miles Northeast of Atlanta</location> <latitude>33 57' 39" N</latitude> <longitude>83 22' 42" W</longitude> </city>
  <city id="DUB">
    <cityname>Dublin</cityname>
    <state>Dub</state>
    <description> Dublin</description>
    <population>1,500,000</population>
    <location>Ireland</location>
    <latitude>NA</latitude>
    <longitude>NA</longitude>
  </city>
</data>