Extracting XML tags with Perl

Vel*_*gan 1 xml perl parsing

I need a Perl script to separate out XML tags. For example:

<bgtres>
 <resume key='267298871' score='5'>
 <xpath path='xpath://resume'>
 <resume canonversion='2' dateversion='2' present='734060'>........... </resume></xpath></resume>
</bgtres>

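In this XML file, I need to separate out the content under the resume tag that sits inside xpath. That is, the resume tag that appears after xpath should be extracted on its own from a bunch of CVs. I need to do this in a Perl script.

Can anyone give me a hint or some code to do this?

Thanks in advance.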

Nik*_*ain 5

  • See XML::Twig, a Perl module for processing huge XML documents in tree mode.
  • XML::Simple, an easy API for maintaining XML (especially configuration files).

For example:

use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

my $xml = q~<?xml version='1.0'?>
<bgtres>
 <resume key='267298871' score='5'>
  <xpath path='xpath://resume'>
   <resume canonversion='2' dateversion='2' present='734060'>
   </resume>
  </xpath>
 </resume>
</bgtres>~;

print $xml,$/;

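# Parse the XML string into a nested hash; the root element <bgtres> is dropped by default.
my $data = XMLin($xml);

# Show the whole parsed structure.
print Dumper( $data );

# Walk the attributes of the inner <resume> (the one inside <xpath>).
foreach my $test (keys %{$data->{resume}{xpath}{resume}}){
        print "$test : $data->{resume}{xpath}{resume}->{$test}\n";
}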

Output:

<?xml version='1.0'?>
<bgtres>
 <resume key='267298871' score='5'>
  <xpath path='xpath://resume'>
   <resume canonversion='2' dateversion='2' present='734060'>
   </resume>
  </xpath>
 </resume>
</bgtres>
$VAR1 = {
          'resume' => {
                      'xpath' => {
                                 'resume' => {
                                             'dateversion' => '2',
                                             'canonversion' => '2',
                                             'present' => '734060'
                                           },
                                 'path' => 'xpath://resume'
                               },
                      'score' => '5',
                      'key' => '267298871'
                    }
        };
dateversion : 2
canonversion : 2
present : 734060
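
The XML::Simple code above addresses a single resume element through fixed hash keys, so it may need adjusting for a file that holds many CVs. As a rough sketch of the XML::Twig route mentioned in the list, the following collects every resume element that sits directly inside an xpath element. The second CV in the sample data, the element text, and the @extracted array are made up for illustration; they are not from the question.

```perl
use strict;
use warnings;
use XML::Twig;

# Sample input with two CVs; the second entry and the text content are invented for illustration.
my $xml = q~<?xml version='1.0'?>
<bgtres>
 <resume key='267298871' score='5'>
  <xpath path='xpath://resume'>
   <resume canonversion='2' dateversion='2' present='734060'>first CV</resume>
  </xpath>
 </resume>
 <resume key='100000001' score='3'>
  <xpath path='xpath://resume'>
   <resume canonversion='2' dateversion='2' present='734061'>second CV</resume>
  </xpath>
 </resume>
</bgtres>~;

# Hypothetical container for the extracted inner <resume> elements.
my @extracted;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Fires only for <resume> elements whose parent is <xpath>,
        # so the outer <resume> wrappers are skipped.
        'xpath/resume' => sub {
            my ($t, $elt) = @_;
            push @extracted, $elt->sprint;   # serialize the element, tags included
            $t->purge;                       # release what has been parsed so far
        },
    },
);

$twig->parse($xml);

print "$_\n" for @extracted;
```

For a file on disk, `$twig->parsefile($filename)` can take the place of `parse($xml)`. The call to `purge` inside the handler is what keeps memory use low on large inputs, at the cost that the element is no longer available after the handler returns.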