kob*_*ame 23 perl utf-8 mason moose plack
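Reformulating the question, because
Comment: this question has already earned the "Popular Question" badge, so I'm probably not the only hopeless one. :)
Unfortunately, demonstrating the full problem stack leads to a very long question, and it is very Mason specific.
First, the opinion-only part :)
I have used HTML::Mason for years and am now trying Mason2. Poet and Mason are the most advanced frameworks on CPAN. I have found nothing comparable that, out of the box, lets you write such clean /but very hackable :)/ web apps with many batteries included (logging, caching, config management, native PSGI support, etc.).
Unfortunately, the author doesn't care about the rest: for example, by default it is ASCII only, and there is no manual, FAQ, or advice on how to use it with Unicode.
Now the facts. A demo. Create a Poet app: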
poet new my #the "my" directory is the $poet_root
mkdir -p my/comps/xls
cd my/comps/xls
and add the following to dhandler.mc (it demonstrates the two basic problems):
<%class>
    has 'dwl';
    use Excel::Writer::XLSX;
</%class>
<%init>
    my $file = $m->path_info;
    $file =~ s/[^\w\.]//g;
    my $cell = lc join ' ', "ÅNGSTRÖM", "in the", $file;
    if( $.dwl ) {
        #create xlsx in the memory
        my $excel;
        open my $fh, '>', \$excel or die "Failed open scalar: $!";
        #write to the in-memory filehandle, not to the (still undefined) scalar
        my $workbook  = Excel::Writer::XLSX->new( $fh );
        my $worksheet = $workbook->add_worksheet();
        $worksheet->write(0, 0, $cell);
        $workbook->close();
        #poet/mason output
        $m->clear_buffer;
        $m->res->content_type("application/vnd.ms-excel");
        $m->print($excel);
        $m->abort();
    }
</%init>
<table border=1>
<tr><td><% $cell %></td></tr>
</table>
<a href="?dwl=yes">download <% $file %></a>
and run the app:
../bin/run.pl
Go to http://0:5000/xls/hello.xlsx and you will get:
+----------------------------+
| ÅngstrÖm in the hello.xlsx |
+----------------------------+
download hello.xlsx
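Click "download hello.xlsx" and you will get hello.xlsx in your downloads.
The above demonstrates the first problem: the component's source is not "under" use utf8;, so lc does not understand the characters.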
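The lc behaviour can be reproduced outside Mason. The following is a minimal sketch using only the core Encode module; the variable names are illustrative, and the byte escapes stand in for what a UTF-8 source file hands to Perl when use utf8; is not in effect.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# "ÅNGSTRÖM" as UTF-8 octets, i.e. how Perl sees the literal without "use utf8;".
my $octets = "\xC3\x85NGSTR\xC3\x96M";

# lc on an octet string only changes ASCII letters; the bytes of Å and Ö stay as they are.
my $lc_octets = lc $octets;

# Decoding first gives lc real characters to work with.
my $lc_chars = lc decode_utf8($octets);

# Encode again before printing so the output is UTF-8 octets.
print $lc_octets, "\n";
print encode_utf8($lc_chars), "\n";
```

On a UTF-8 terminal the first line should still show the uppercase Å and Ö, as in the table above, while the second line is fully lowercased.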
The second problem is the following: try [http://0:5000/xls/hélló.xlsx], or http://0:5000/xls/h%C3%A9ll%C3%B3.xlsx, and you will see:
+--------------------------+
| ÅngstrÖm in the hll.xlsx |
+--------------------------+
download hll.xlsx
#note the wrong filename
Of course, the input (the path_info) is not decoded; the script works with UTF-8 encoded octets rather than Perl characters.
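The wrong filename can be shown in isolation as well. This is a small sketch, assuming the path arrives as raw UTF-8 octets (the $raw value below is made up to match the URL above); it applies the same substitution as dhandler.mc once to the octets and once to the decoded characters.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Path segment as raw UTF-8 octets for "hélló.xlsx" (assumed input).
my $raw = "h\xC3\xA9ll\xC3\xB3.xlsx";

# Same cleanup as in dhandler.mc, applied to the undecoded octets:
# the bytes of the accented letters do not match \w and are stripped.
(my $from_octets = $raw) =~ s/[^\w\.]//g;

# Decode first, then clean up: \w now matches the accented letters.
(my $from_chars = decode_utf8($raw)) =~ s/[^\w\.]//g;

print "$from_octets\n";                # octets path
print encode_utf8($from_chars), "\n";  # characters path, re-encoded for output
```

Decoding before the substitution keeps the accented letters, which is why the decoding step belongs before any code that inspects the path.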
So, telling Perl "the source is in UTF-8" by adding use utf8; to the <%class%> block results in:
+--------------------------+
| ?ngstr?m in the hll.xlsx |
+--------------------------+
download hll.xlsx
Adding use feature 'unicode_strings' (or use 5.014;) is even worse:
+----------------------------+
| ?ngstr?m in the h?ll?.xlsx |
+----------------------------+
download h?ll?.xlsx
Of course, the source now contains wide characters, so it needs Encode::encode_utf8 on output.
One could try to use a filter:
<%filter uencode><% Encode::encode_utf8($yield->()) %></%filter>
and filter the whole output:
% $.uencode {{
<table border=1>
<tr><td><% $cell %></td></tr>
</table>
<a href="?dwl=yes">download <% $file %></a>
% }}
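But this helps only partially, because you still have to take care of encoding in the <%init%> or <%perl%> blocks. Encoding/decoding inside the Perl code in many places (read: not at the boundaries) leads to spaghetti code.
The encoding/decoding should clearly be done somewhere at the Poet/Mason boundaries; of course, Plack operates at the byte level.
Partial solution.
Fortunately, Poet cleverly allows you to modify its (and Mason's) parts, so in $poet_root/lib/My/Mason you can modify Compilation.pm to: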
override 'output_class_header' => sub {
    return join("\n",
        super(), qq(
        use 5.014;
        use utf8;
        use Encode;
        )
    );
};
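which inserts the desired preamble into every Mason component. (Don't forget to touch every component, or simply remove the compiled objects from $poet_root/data/obj.)
You can also try to handle the request/response at the boundaries by editing $poet_root/lib/My/Mason/Request.pm to: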
#found this code somewhere on the net
use Encode;
override 'run' => sub {
    my($self, $path, $args) = @_;
    #decode values - but still missing the "keys" decode
    foreach my $k (keys %$args) {
        $args->set($k, decode_utf8($args->get($k)));
    }
    my $result = super();
    #encode the output - BUT THIS BREAKS the inline XLS
    $result->output( encode_utf8($result->output()) );
    return $result;
};
Encoding everything is the wrong strategy; it breaks, for example, the XLS.
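Why blanket encoding breaks the download can be seen without Mason. The sketch below uses only the core Encode module; encode_body_if_text is a hypothetical helper invented for this illustration, not part of Poet or Mason, and the sample bytes merely resemble the start of a zip container (an .xlsx file is a zip archive).

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper: encode only textual bodies, pass everything else through untouched.
sub encode_body_if_text {
    my ($content_type, $body) = @_;
    return $body unless defined $content_type && $content_type =~ m{^text/}i;
    return encode_utf8($body);
}

# Zip-like signature followed by two high bytes, standing in for binary XLSX data.
my $binary = "PK\x03\x04\xFF\x80";

# Blind encoding treats each byte as a Latin-1 character and expands the high ones.
my $mangled = encode_utf8($binary);

# A page body held as Perl characters.
my $html = "<td>\x{E5}ngstr\x{F6}m</td>";

my $xlsx_out = encode_body_if_text('application/vnd.ms-excel', $binary);
my $html_out = encode_body_if_text('text/html', $html);

printf "binary: %d bytes, after blind encode: %d bytes\n", length($binary), length($mangled);
printf "xlsx body unchanged: %s\n", ($xlsx_out eq $binary ? 'yes' : 'no');
printf "html body: %d characters, %d octets after encoding\n", length($html), length($html_out);
```

The design choice is to make the decision from the content type at the output boundary, so components keep working with characters and binary payloads are never re-encoded.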
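So, 4 years later (I asked the original question in 2011), I still don't know :( how to correctly use Unicode in Mason2 applications, and there is still no documentation or help about it. :(
The main questions are: where (which methods should be modified with Moose method modifiers) and how to correctly decode the input, and where to encode the output (in a Poet/Mason app)
text/plain or text/html... Could someone please help with real code: what should I modify in the above?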
OK, I have tested this with Firefox. The HTML displays UTF-8 correctly and the zip is left alone, so it should work everywhere.

If you start with poet new My, apply the patch you need with patch -p1 -i...path/to/thisfile.diff.

diff -ruN orig/my/comps/Base.mc new/my/comps/Base.mc
--- orig/my/comps/Base.mc 2015-05-20 21:48:34.515625000 -0700
+++ new/my/comps/Base.mc 2015-05-20 21:57:34.703125000 -0700
@@ -2,9 +2,10 @@
 has 'title' => (default => 'My site');
 </%class>
 
-<%augment wrap>
-  <html>
+<%augment wrap><!DOCTYPE html>
+  <html lang="en-US">
     <head>
+      <meta charset="utf-8">
       <link rel="stylesheet" href="/static/css/style.css">
 % $.Defer {{
       <title><% $.title %></title>
diff -ruN orig/my/comps/xls/dhandler.mc new/my/comps/xls/dhandler.mc
--- orig/my/comps/xls/dhandler.mc 1969-12-31 16:00:00.000000000 -0800
+++ new/my/comps/xls/dhandler.mc 2015-05-20 21:53:42.796875000 -0700
@@ -0,0 +1,30 @@
+<%class>
+    has 'dwl';
+    use Excel::Writer::XLSX;
+</%class>
+<%init>
+    my $file = $m->path_info;
+    $file = decode_utf8( $file );
+    $file =~ s/[^\w\.]//g;
+    my $cell = lc join ' ', "ÅNGSTRÖM", "in the", $file ;
+    if( $.dwl ) {
+        #create xlsx in the memory
+        my $excel;
+        open my $fh, '>', \$excel or die "Failed open scalar: $!";
+        my $workbook = Excel::Writer::XLSX->new( $fh );
+        my $worksheet = $workbook->add_worksheet();
+        $worksheet->write(0, 0, $cell);
+        $workbook->close();
+
+        #poet/mason output
+        $m->clear_buffer;
+        $m->res->content_type("application/vnd.ms-excel");
+        $m->print($excel);
+        $m->abort();
+    }
+</%init>
+<table border=1>
+<tr><td><% $cell %></td></tr>
+</table>
+<p> <a href="%c3%85%4e%47%53%54%52%c3%96%4d%20%68%c3%a9%6c%6c%c3%b3">ÅNGSTRÖM hélló</a>
+<p> <a href="?dwl=yes">download <% $file %></a>
diff -ruN orig/my/lib/My/Mason/Compilation.pm new/my/lib/My/Mason/Compilation.pm
--- orig/my/lib/My/Mason/Compilation.pm 2015-05-20 21:48:34.937500000 -0700
+++ new/my/lib/My/Mason/Compilation.pm 2015-05-20 21:49:54.515625000 -0700
@@ -5,11 +5,13 @@
 extends 'Mason::Compilation';
 
 # Add customizations to Mason::Compilation here.
-#
-# e.g. Add Perl code to the top of every compiled component
-#
-# override 'output_class_header' => sub {
-#      return join("\n", super(), 'use Foo;', 'use Bar qw(baz);');
-# };
-
+override 'output_class_header' => sub {
+    return join("\n",
+        super(), qq(
+        use 5.014;
+        use utf8;
+        use Encode;
+        )
+    );
+};
 1;
\ No newline at end of file
diff -ruN orig/my/lib/My/Mason/Request.pm new/my/lib/My/Mason/Request.pm
--- orig/my/lib/My/Mason/Request.pm 2015-05-20 21:48:34.968750000 -0700
+++ new/my/lib/My/Mason/Request.pm 2015-05-20 21:55:03.093750000 -0700
@@ -4,20 +4,27 @@
 
 extends 'Mason::Request';
 
-# Add customizations to Mason::Request here.
-#
-# e.g. Perform tasks before and after each Mason request
-#
-# override 'run' => sub {
-#     my $self = shift;
-#
-#     do_tasks_before_request();
-#
-#     my $result = super();
-#
-#     do_tasks_after_request();
-#
-#     return $result;
-# };
+use Encode qw/ encode_utf8 decode_utf8 /;
 
-1;
\ No newline at end of file
+override 'run' => sub {
+    my($self, $path, $args) = @_;
+    foreach my $k (keys %$args) {
+        my $v = $args->get($k);
+        $v=decode_utf8($v);
+        $args->set($k, $v);
+    }
+    my $result = super();
+    my( $ctype, $charset ) = $self->res->headers->content_type_charset;
+    if( ! $ctype ){
+        $ctype = 'text/html';
+        $charset = 'UTF-8';
+        $self->res->content_type( "$ctype; $charset");
+        $result->output( encode_utf8(''.( $result->output())) );
+    } elsif( ! $charset and $ctype =~ m{text/(?:plain|html)} ){
+        $charset = 'UTF-8';
+        $self->res->content_type( "$ctype; $charset");
+        $result->output( encode_utf8(''.( $result->output())) );
+    }
+    return $result;
+};
+1;