How to match Chinese characters with a Perl regular expression

Hai*_*ang 2 regex perl

I need to match some Chinese characters in UTF-8 encoded HTML. I wrote the following test code:

#! /usr/bin/perl

use strict;
use LWP::UserAgent;
use Encode;

my $ua = new LWP::UserAgent;

my $request = HTTP::Request->new('GET');
my $url = 'http://www.boc.cn/sourcedb/whpj/';
$request->url($url);

my $res = $ua->request($request) ;

my $str_chinese =   encode("utf8" ,"??" ) ;  
# my $str_chinese = "??" ;


my $str_english = "English" ;
#my $html = decode("utf8" , $res->content) ;
my $html = $res->content ; 

if ( $html =~ /$str_chinese/ ) {
     print "chinese word matched\n" ;
}else {
     print "chinese word unmatched\n" ;
}

if ( $html =~ /$str_english/i ) {
    print "english word matched\n" ;
}else {
    print "english word unmatched\n" ;
}

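The output shows that the script fails to match Chinese characters that are present in the HTML. Can you give me some hints on how to solve this?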

Ala*_*avi 7

Because your source code contains literal UTF-8 characters, you need to add:

use utf8;

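This tells Perl that your script itself is written in UTF-8. Without it, Perl reads the string literal as individual bytes, so encode("utf8", ...) encodes it a second time and the result no longer matches the bytes of the page.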
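
As a rough illustration of the idea, here is a minimal sketch that combines `use utf8` with decoding the fetched bytes before matching. The helper `contains_text`, the sample word 中文, and the hard-coded HTML fragment are assumptions made up for this example (the fragment stands in for `$res->content` so the script runs without network access); they are not taken from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                              # string literals in this file are UTF-8 text
use Encode qw(decode encode);

binmode(STDOUT, ':encoding(UTF-8)');   # write characters to STDOUT as UTF-8

# Hypothetical helper: decode raw UTF-8 bytes into characters,
# then match the needle literally (\Q...\E escapes regex metacharacters).
sub contains_text {
    my ($bytes, $needle) = @_;
    my $text = decode('UTF-8', $bytes);
    return $text =~ /\Q$needle\E/ ? 1 : 0;
}

# Stand-in for the raw bytes that $res->content would return (made-up sample).
my $bytes = encode('UTF-8', '<td>中文</td><td>English</td>');

my $found = contains_text($bytes, '中文');
print $found ? "chinese word matched\n" : "chinese word unmatched\n";

# \p{Han} matches any single Han character once the text is decoded,
# so this collects every run of Han characters in the sample.
my $text = decode('UTF-8', $bytes);
my @han  = $text =~ /(\p{Han}+)/g;
print "Han runs: @han\n";
```

Both sides of the match are character strings here, which is why the comparison works. With LWP, `$res->decoded_content` should give already-decoded characters based on the response's charset, which may be more convenient than calling `decode` by hand.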