I'd like to read a log file and split each line into 4 scalars.
Here is a sample of the log file:
[time1] [error1] [who is1] mess is here1
[time2] [error2] mess is here2
and I'd like to end up with these scalars:
($time, $err, $who, $mess)=('time1', 'error1', 'who is1', 'mess is here1')
($time, $err, $who, $mess)=('time2', 'error2', '', 'mess is here2')
How can I do this in Perl?
My code looks like this, but it doesn't work:
while (<MYFILE>) {
chomp;
($time, $err, $who, $mess)=($_ =~/\[([.]*)\] \[([.]*)\] (\[([.]*)\]|[ ])([.]*)/);
$logi.= "<tr><td>$time</td><td>$err</td><td>$who</td><td>$mess</td></tr>\n";
}
Here is one way to do it, using compiled regexes and the /x flag, which ignores whitespace in the pattern and makes it easier to read.
my $block_re = qr{ \[ (.*?) \] }x; # [some thing]
my $log_re = qr{^
    $block_re \s+ $block_re \s+ (?: $block_re \s+ )? # two or three blocks
    (.*) # the log message
$}x;
while (my $line = <$fh>) {
    chomp $line;
    my @fields = $line =~ $log_re or next; # skip lines that don't match
    my $message = pop @fields;
    my ($time, $err, $who) = @fields;
    $who //= ''; # the third block is optional, so default it to ''
    print "time: $time, err: $err, who: $who, message: $message\n";
}
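One of the keys to the block regex is the "non-greedy" match operator, .*?. Normally .* matches the longest string it can, which means m{ \[ .* \] }x would match all of "[foo] [bar] [baz]" rather than just "[foo]". Adding a ? makes it non-greedy, so it matches the shortest string it can, which is just "[foo]".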
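To see the difference for yourself, here is a small sketch that runs both forms against the same string. The variable names ($greedy, $lazy) are only for illustration.

```perl
use strict;
use warnings;

my $line = '[foo] [bar] [baz]';

# Greedy: .* takes as much as it can, so the capture runs up to the last "]".
my ($greedy) = $line =~ m{ \[ (.*) \] }x;

# Non-greedy: .*? stops at the first "]" that lets the match succeed.
my ($lazy) = $line =~ m{ \[ (.*?) \] }x;

print "greedy: $greedy\n";   # greedy: foo] [bar] [baz
print "lazy: $lazy\n";       # lazy: foo
```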
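The other change I made was to treat the last field, rather than the fourth, as the message. I suspect your format can have as many of those "[foo]" blocks as it wants.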
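If the number of blocks really is open-ended, something along these lines might work. It is only a sketch: parse_log_line is a made-up helper name, and it assumes the message itself never starts with a "[...]" block, since that would be peeled off as one more field.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (array ref of leading blocks, message).
sub parse_log_line {
    my ($line) = @_;
    chomp $line;
    my @blocks;
    # Peel one leading "[...]" block off the front of the line per iteration.
    while ($line =~ s{^ \s* \[ (.*?) \] }{}x) {
        push @blocks, $1;
    }
    $line =~ s/^\s+//;    # whatever is left is the message
    return (\@blocks, $line);
}

for my $line ("[time1] [error1] [who is1] mess is here1\n",
              "[time2] [error2] mess is here2\n") {
    my ($blocks, $message) = parse_log_line($line);
    print join(' | ', @$blocks), " => $message\n";
}
```

This works on a copy of the line, so the caller's string is left alone. You would then pick $time, $err and so on out of the returned array by position.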