在 Powershell 中自动将 XML 转换为 CSV

Har*_* CO 5 powershell xml

我有几千个不同格式的 xml 文件,我正在尝试将其转换为 CSV。我注意到在 Excel 中,您可以“作为 XML 表”打开一个 XML 文件,它会根据 XML 源数据创建一个架构。有没有办法在 Powershell 中使用此 Excel 功能,以便我可以自动转换这些文件?

到目前为止,我所有的研究都在您需要知道模式来解析文件的地方结束,考虑到我正在处理的卷,这没有帮助。

这是一个示例:

<?xml version='1.0' encoding='UTF-8'?>
<results exportTime="2016-02-11 21:44:04 CST">
    <report label="S Report 5x" slug="L211">
        <record reporting_date_end="03/28/2014" narrative="null">
            <report label="Current Volume">
                <record central_volume_loads="166.79" zone1_volume="233.26" zone3_volume="17.52" zone2_volume="48.94" zone1_volume_pounds="6,671,777" zone4_volume_pounds="9,330,260" zone2_volume_pounds="700,694" zone3_volume_pounds="1,957,789"/>
            </report>
            <report label="Zone1">
                <record imps_percentage="92-94%" total_units="794,007" total_loads="19.85" weighted_average="274.76"/>
                <record imps_percentage="90%" total_units="1,091,657" total_loads="27.29" weighted_average="264.35"/>
                <record imps_percentage="85%" total_units="3,467,452" total_loads="86.69" weighted_average="228.20"/>
                <record imps_percentage="81%" total_units="12,501" total_loads=".31" weighted_average="228.24"/>
                <record imps_percentage="75%" total_units="195,690" total_loads="4.89" weighted_average="187.30"/>
                <record imps_percentage="73%" total_units="0" total_loads=".00" weighted_average=".00"/>
                <record imps_percentage="65%" total_units="1,110,470" total_loads="27.76" weighted_average="166.04"/>
            </report>
            <report label="Zone2">
                <record imps_percentage="92-94%" total_units="891,007" total_loads="22.28" weighted_average="275.26"/>
                <record imps_percentage="90%" total_units="1,813,306" total_loads="45.33" weighted_average="265.04"/>
                <record imps_percentage="85%" total_units="4,378,347" total_loads="109.46" weighted_average="231.04"/>
                <record imps_percentage="81%" total_units="256,289" total_loads="6.41" weighted_average="224.46"/>
                <record imps_percentage="75%" total_units="234,200" total_loads="5.86" weighted_average="183.03"/>
                <record imps_percentage="73%" total_units="0" total_loads=".00" weighted_average=".00"/>
                <record imps_percentage="65%" total_units="1,757,111" total_loads="43.93" weighted_average="165.35"/>
            </report>
        </record>
        <record reporting_date_end="03/21/2014" narrative="null">
            <report label="Current Volume">
                <record central_volume_loads="144.13" zone1_volume="215.06" zone3_volume="27.46" zone2_volume="43.47" zone1_volume_pounds="5,765,336" zone4_volume_pounds="8,602,512" zone2_volume_pounds="1,098,540" zone3_volume_pounds="1,738,636"/>
            </report>
            <report label="Zone1">
                <record imps_percentage="92-94%" total_units="857,233" total_loads="21.43" weighted_average="268.96"/>
                <record imps_percentage="90%" total_units="826,451" total_loads="20.66" weighted_average="260.82"/>
                <record imps_percentage="85%" total_units="3,104,954" total_loads="77.62" weighted_average="223.99"/>
                <record imps_percentage="81%" total_units="8,145" total_loads=".20" weighted_average="248.76"/>
                <record imps_percentage="75%" total_units="136,304" total_loads="3.41" weighted_average="183.96"/>
                <record imps_percentage="73%" total_units="0" total_loads=".00" weighted_average=".00"/>
                <record imps_percentage="65%" total_units="832,249" total_loads="20.81" weighted_average="164.72"/>
            </report>
            <report label="Zone2">
                <record imps_percentage="92-94%" total_units="995,233" total_loads="24.88" weighted_average="270.50"/>
                <record imps_percentage="90%" total_units="1,696,678" total_loads="42.42" weighted_average="262.66"/>
                <record imps_percentage="85%" total_units="3,929,416" total_loads="98.24" weighted_average="227.55"/>
                <record imps_percentage="81%" total_units="269,549" total_loads="6.74" weighted_average="220.32"/>
                <record imps_percentage="75%" total_units="181,125" total_loads="4.53" weighted_average="186.37"/>
                <record imps_percentage="73%" total_units="0" total_loads=".00" weighted_average=".00"/>
                <record imps_percentage="65%" total_units="1,530,511" total_loads="38.26" weighted_average="165.33"/>
            </report>
        </record>
        <record reporting_date_end="03/14/2014" narrative="null">
            <report label="Current Volume">
                <record central_volume_loads="126.80" zone1_volume="186.95" zone3_volume="25.49" zone2_volume="34.67" zone1_volume_pounds="5,071,823" zone4_volume_pounds="7,478,136" zone2_volume_pounds="1,019,473" zone3_volume_pounds="1,386,840"/>
            </report>
            <report label="Zone1">
                <record imps_percentage="92-94%" total_units="738,249" total_loads="18.46" weighted_average="263.39"/>
                <record imps_percentage="90%" total_units="783,791" total_loads="19.59" weighted_average="254.45"/>
                <record imps_percentage="85%" total_units="2,586,286" total_loads="64.66" weighted_average="220.72"/>
                <record imps_percentage="81%" total_units="49,348" total_loads="1.23" weighted_average="191.71"/>
                <record imps_percentage="75%" total_units="131,844" total_loads="3.30" weighted_average="182.33"/>
                <record imps_percentage="73%" total_units="0" total_loads=".00" weighted_average=".00"/>
                <record imps_percentage="65%" total_units="782,305" total_loads="19.56" weighted_average="150.54"/>
            </report>
        </record>
</report>
</results>
Run Code Online (Sandbox Code Playgroud)

Excel 结果如下:

exportTime  label   slug    reporting_date_end  narrative   label2  central_volume_loads    zone1_volume    zone3_volume    zone2_volume    zone1_volume_pounds zone4_volume_pounds zone2_volume_pounds zone3_volume_pounds imps_percentage total_units total_loads weighted_average
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Current Volume  166.79  233.26  17.52   48.94   6,671,777   9,330,260   700,694 1,957,789               
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone1                                   92-94%  794,007 19.85   274.76
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone1                                   90% 1,091,657   27.29   264.35
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone1                                   85% 3,467,452   86.69   228.20
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone1                                   81% 12,501  .31 228.24
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone1                                   75% 195,690 4.89    187.30
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone1                                   73% 0   .00 .00
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone1                                   65% 1,110,470   27.76   166.04
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone2                                   92-94%  891,007 22.28   275.26
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone2                                   90% 1,813,306   45.33   265.04
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone2                                   85% 4,378,347   109.46  231.04
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone2                                   81% 256,289 6.41    224.46
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone2                                   75% 234,200 5.86    183.03
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone2                                   73% 0   .00 .00
2016-02-11 21:44:04 CST S Report 5x L211    03/28/2014  null    Zone2                                   65% 1,757,111   43.93   165.35
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Current Volume  144.13  215.06  27.46   43.47   5,765,336   8,602,512   1,098,540   1,738,636               
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone1                                   92-94%  857,233 21.43   268.96
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone1                                   90% 826,451 20.66   260.82
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone1                                   85% 3,104,954   77.62   223.99
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone1                                   81% 8,145   .20 248.76
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone1                                   75% 136,304 3.41    183.96
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone1                                   73% 0   .00 .00
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone1                                   65% 832,249 20.81   164.72
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone2                                   92-94%  995,233 24.88   270.50
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone2                                   90% 1,696,678   42.42   262.66
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone2                                   85% 3,929,416   98.24   227.55
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone2                                   81% 269,549 6.74    220.32
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone2                                   75% 181,125 4.53    186.37
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone2                                   73% 0   .00 .00
2016-02-11 21:44:04 CST S Report 5x L211    03/21/2014  null    Zone2                                   65% 1,530,511   38.26   165.33
2016-02-11 21:44:04 CST S Report 5x L211    03/14/2014  null    Current Volume  126.8   186.95  25.49   34.67   5,071,823   7,478,136   1,019,473   1,386,840               
2016-02-11 21:44:04 CST S Report 5x L211    03/14/2014  null    Zone1                                   92-94%  738,249 18.46   263.39
2016-02-11 21:44:04 CST S Report 5x L211    03/14/2014  null    Zone1                                   90% 783,791 19.59   254.45
2016-02-11 21:44:04 CST S Report 5x L211    03/14/2014  null    Zone1                                   85% 2,586,286   64.66   220.72
2016-02-11 21:44:04 CST S Report 5x L211    03/14/2014  null    Zone1                                   81% 49,348  1.23    191.71
2016-02-11 21:44:04 CST S Report 5x L211    03/14/2014  null    Zone1                                   75% 131,844 3.30    182.33
2016-02-11 21:44:04 CST S Report 5x L211    03/14/2014  null    Zone1                                   73% 0   .00 .00
2016-02-11 21:44:04 CST S Report 5x L211    03/14/2014  null    Zone1                                   65% 782,305 19.56   150.54
Run Code Online (Sandbox Code Playgroud)

har*_*ymc 1

首先,您的 XML 是错误的,因为它包含嵌套在自身内部的元素。report这与和项目有关record我已在Pastebin中发布了该文件的固定版本,其中我已将外部reportrecord项目重命名为xreportand xrecord

其次,由于它包含嵌套元素,因此不能简单地将它们转换为 CSV,因为一个 CSV 文件只能包含一种类型的元素。

由于嵌套问题,PowerShell 无法对此类文件使用,因为它只能执行简单的转换。 有关示例,请参阅文章Cool powershell:从 Xml 到 Csv 在两行中。

PowerShell 可以手动编码来转换已知结构的 XML 文件。例如,请参阅这些文章:

可以处理这种复杂性的工具是免费的 Xml To Csv Conversion Tool

为了使其工作,您需要下载并解压以下内容:

一旦解压到一个文件夹中,您就可以使用交互式程序来测试其对一两个文件的功能。

如果它满足您的需要,您可以使用控制台变体来批量转换文件。批处理命令的格式为:

XmlToCsv.Console.exe -xml <source.xml> -dir <destination-folder>
Run Code Online (Sandbox Code Playgroud)

批处理文件需要将具有相似结构的文件转换为各自的文件夹,因为它将上述示例文件转换为五个名为 record.csv、report.csv、results.csv、xrecord.csv、xreport.csv 的文件。由于这些名称对于结构相似的 XML 文件重复,因此如果生成的文件全部定向到同一文件夹,则它们将相互覆盖。


如果这个程序不能满足您的需求,您将需要编写一个 Powershell 脚本,该脚本应该有两个阶段:

  1. 第 1 阶段:通过深入到最低级别创建 CSV 标头
  2. 第 2 阶段:深度搜索整个 XML 以生成 CSV 数据行。

您的脚本可以循环遍历文件,每个文件执行以下操作:

[xml] $xml = [xml] (Get-Content -Path \path\to\file.xml)
[System.Xml.XmlElement] $root = $xml.get_DocumentElement()
Run Code Online (Sandbox Code Playgroud)

$root 包含<results>具有以下属性的节点:

获取会员

请注意,您正在查找 MemberType 为 Property 的项目。具有字符串定义的项目是您需要收集的项目,而 System.Object 则需要再次钻取:

在此输入图像描述

然后再次 :

在此输入图像描述

这次我们需要深入了解具有 System.Xml.XmlElement 定义的项目:

在此输入图像描述

现在您已拥有所有属性并可以构建标题行。

脚本本身需要使用forach命令循环遍历对象的属性并检查 MemberType 和 Definition 的属性。

我无法承担编写完整脚本所需的时间,但希望这可以帮助您开始。