Perl and HTML: UTF-8 does not work in forms

use*_*670 0 html perl utf-8 character-encoding

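I am trying to convert my Perl/HTML files to UTF-8. Unfortunately, I have a problem with my forms. I wrote a small test script to illustrate the problem. All it does is reload itself so that the entered text is displayed again. It works for ASCII characters. As soon as I enter German umlauts (ÄÖÜ), the characters get mangled. It cannot handle Russian characters (ЭЯЮ) either. Here is the script: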

#!/usr/bin/perl

use utf8;
use Encode;
use open ':std', ':encoding(UTF-8)';

# Save the query string in a hash:
$querystring = $ENV{ 'QUERY_STRING' };
read(STDIN, $poststring, $ENV{CONTENT_LENGTH});
if (($querystring ne "") && ($poststring ne "")) { $querystring .= "&$poststring"; } 
    else { $querystring .= $poststring; }

$querystring =~ s/&/=/gi;
%query = split( /=/, $querystring );
foreach $key ( keys( %query ) ) {
    $query{$key} =~ tr/+/ /;
    $query{$key} =~ s/%([\da-f][\da-f])/chr( hex($1) )/egi;
    $uquer{$key} = decode_utf8( $query{$key} );
}

print "Content-Type: text/html; charset=\"UTF-8\"\n\n";
print <<END;
    <HTML>
        <HEAD>
            <META HTTP-EQUIV="Content-Type" content="text/html; charset=utf-8">
        </HEAD>
        <BODY>
            <FORM NAME="frmeing" METHOD="POST" ACTION="test0.cgi">
                <INPUT NAME="df_kurs" TYPE="TEXT" VALUE="$uquer{'df_kurs'}">
                <INPUT TYPE="SUBMIT">
            </FORM>
        </BODY>
    </HTML>
END

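You can also test this script yourself. It is online at this address: http://project-website.org/test/test0.cgi Does anyone know what the problem might be? Thanks in advance for your help!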

ike*_*ami 5

This is due to a bug in your version of decode_utf8.

$ perl -Mutf8 -MEncode -E'
   $u = $d = encode_utf8("é");
   utf8::upgrade($u);   # Changes how the string is stored internally
   say $u eq $d ?1:0;
   say decode_utf8($d) eq decode_utf8($u) ?1:0;
'
1
0

As you can see, $u and $d are equal, yet your version of decode_utf8 decodes them differently. Specifically, it returns $u unchanged.

This has been fixed in newer versions of Encode. (2.53, I think.)
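To see which Encode you are running, and to try a possible workaround, something like the following may help. This is only a sketch and not part of the original answer: it assumes the bug is triggered purely by how the string is stored internally, as the demonstration above suggests, so forcing byte storage with utf8::downgrade before decoding should sidestep it on affected versions.

```perl
use strict;
use warnings;
use Encode qw( encode_utf8 decode_utf8 );

# Report the installed Encode version.
print "Encode version: $Encode::VERSION\n";

my $d = encode_utf8("\x{E9}");   # UTF-8 bytes for U+00E9 (é)
my $u = $d;
utf8::upgrade($u);               # Same string, different internal storage

# Assumed workaround for old Encode: force byte storage before decoding.
utf8::downgrade($u);

my $same = decode_utf8($u) eq decode_utf8($d) ? 1 : 0;
print "decoded identically: $same\n";
```

On a fixed Encode the downgrade is harmless, so the same code can run on both old and new installations.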

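The simpler way to solve the problem is to fix your own bug. By using use open ':std', you told your program to decode STDIN from UTF-8 before you undo the URL encoding and then decode from UTF-8 a second time.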
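The intended order of operations is: read raw bytes, undo the URL encoding, then decode from UTF-8 exactly once. A minimal sketch of that order, using a hypothetical helper named parse_form that is not part of the original script:

```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical helper: turn a raw application/x-www-form-urlencoded
# byte string into a hash of decoded character strings.
sub parse_form {
    my ($raw) = @_;
    my %form;
    for my $pair ( split /&/, $raw ) {
        next unless length $pair;
        my ( $key, $value ) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ( $key, $value ) {
            tr/+/ /;                                 # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr( hex($1) )/eg;   # Percent-encoding -> bytes
            $_ = decode( 'UTF-8', $_ );              # Bytes -> characters, once
        }
        $form{$key} = $value;
    }
    return %form;
}

my %form = parse_form('df_kurs=%C3%84%C3%96%C3%9C&note=a+b');
print length( $form{df_kurs} ), "\n";   # prints 3 (Ä, Ö, Ü as characters)
```

The input must still be bytes when it reaches the helper, which is why STDIN should not have a decoding layer applied to it.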

Fixed:

#!/usr/bin/perl

use utf8;                      # Source code is encoded using UTF-8.
use open ':encoding(UTF-8)';   # Set default encoding for file handles.
BEGIN { binmode(STDOUT, ':encoding(UTF-8)'); }  # HTML
BEGIN { binmode(STDERR, ':encoding(UTF-8)'); }  # Error log

use Encode;

# Save the query string in a hash:
$querystring = $ENV{ 'QUERY_STRING' };
read(STDIN, my $poststring, $ENV{CONTENT_LENGTH});
if (($querystring ne "") && ($poststring ne "")) { $querystring .= "&$poststring"; } 
    else { $querystring .= $poststring; }

$querystring =~ s/&/=/gi;
%query = split( /=/, $querystring );
foreach $key ( keys( %query ) ) {
    $query{$key} =~ tr/+/ /;
    $query{$key} =~ s/%([\da-f][\da-f])/chr( hex($1) )/egi;
    $uquer{$key} = decode_utf8( $query{$key} );
}

print "Content-Type: text/html; charset=\"UTF-8\"\n\n";
print <<END;
    <HTML>
        <HEAD>
            <META HTTP-EQUIV="Content-Type" content="text/html; charset=utf-8">
        </HEAD>
        <BODY>
            <FORM NAME="frmeing" METHOD="POST">
                <INPUT NAME="df_kurs" TYPE="TEXT" VALUE="$uquer{'df_kurs'}">
                <INPUT TYPE="SUBMIT">
            </FORM>
        </BODY>
    </HTML>
END

But you really should be using CGI.pm.

#!/usr/bin/perl

use strict;    # Always!
use warnings;  # Always!

use utf8;                      # Source code is encoded using UTF-8.
use open ':encoding(UTF-8)';   # Set default encoding for file handles.
BEGIN { binmode(STDOUT, ':encoding(UTF-8)'); }  # HTML
BEGIN { binmode(STDERR, ':encoding(UTF-8)'); }  # Error log

use CGI qw( -utf8 );
use Encode;

my $cgi = CGI->new();
my %uquer = $cgi->Vars();

print $cgi->header('text/html; charset=UTF-8');
print <<END;
    <HTML>
        <HEAD>
            <META HTTP-EQUIV="Content-Type" content="text/html; charset=utf-8">
        </HEAD>
        <BODY>
            <FORM NAME="frmeing" METHOD="POST">
                <INPUT NAME="df_kurs" TYPE="TEXT" VALUE="$uquer{'df_kurs'}">
                <INPUT TYPE="SUBMIT">
            </FORM>
        </BODY>
    </HTML>
END