In Perl, how can I check whether an encoding specified in a string is valid?

Sin*_*nür 10 perl file-io character-encoding

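Say I have a sub that receives two arguments: an encoding specification and a file path. The sub then uses that information to open a file for reading, as shown below: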

run({
    encoding => 'UTF-16---LE',
    input_filename => 'test_file.txt',
});

sub run {
    my $args = shift;
    my ($enc, $fn) = @{ $args }{qw(encoding input_filename)};

    my $is_ok = open my $in,
        sprintf('<:encoding(%s)', $args->{encoding}),
        $args->{input_filename}
    ;
}

Now, this croaks:

Cannot find encoding "UTF-16---LE" at E:\Home\...

What is the right way to ensure that $args->{encoding} holds a valid encoding specification before interpolating it into the second argument to open?

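Update

The following information is provided in the hope that it will be useful to someone at some point. I am also going to file a bug report.

The documentation for Encode::Alias does not mention find_alias at all. A casual look at Encode/Alias.pm on my Windows system shows: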

# Public, encouraged API is exported by default

our @EXPORT =
  qw (
  define_alias
  find_alias
);

But note:

#!/usr/bin/env perl

use 5.014;
use Encode::Alias;
say find_alias('UTF-8')->name;

yields:

Use of uninitialized value $find in exists at C:/opt/Perl/lib/Encode/Alias.pm line 25.
Use of uninitialized value $find in hash element at C:/opt/Perl/lib/Encode/Alias.pm line 26.
Use of uninitialized value $find in pattern match (m//) at C:/opt/Perl/lib/Encode/Alias.pm line 31.
Use of uninitialized value $find in lc at C:/opt/Perl/lib/Encode/Alias.pm line 40.
Use of uninitialized value $find in pattern match (m//) at C:/opt/Perl/lib/Encode/Alias.pm line 31.
Use of uninitialized value $find in lc at C:/opt/Perl/lib/Encode/Alias.pm line 40.

Being 1) lazy and 2) inclined to assume first that I was doing something wrong, I decided to seek the wisdom of others.

In any case, the bug is that find_alias is exported as a function, but the code does not account for being called as one:

sub find_alias {
    require Encode;
    my $class = shift;
    my $find  = shift;
    unless ( exists $Alias{$find} ) {

If find_alias is not invoked as a method, its argument ends up in $class and $find is undefined.

HTH.

dax*_*xim 5

Encode::Alias->find_alias($encoding_name) returns an object whose name attribute is the canonical encoding name on success, and returns false on failure.

$ Encode::Alias->find_alias('UTF-16---LE')
$ Encode::Alias->find_alias('UTF-16 LE')
Encode::Unicode  {
    Parents       Encode::Encoding
    Linear @ISA   Encode::Unicode, Encode::Encoding
    public methods (6) : bootstrap, decode, decode_xs, encode, encode_xs, renew
    private methods (0)
    internals: {
        endian   "v",
        Name   "UTF-16LE",
        size   2,
        ucs2   ""
    }
}
$ Encode::Alias->find_alias('Latin9')
Encode::XS  {
    public methods (9) : cat_decode, decode, encode, mime_name, name, needs_lines, perlio_ok, renew, renewed
    private methods (0)
    internals: 140076283926592
}
$ Encode::Alias->find_alias('UTF-16 LE')->name
UTF-16LE
$ Encode::Alias->find_alias('Latin9')->name
iso-8859-15
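The behaviour shown in the session above could be wrapped in a small helper. This is only a minimal sketch: canonical_encoding_name is a hypothetical name, not part of Encode, and it relies on find_alias being called as a class method (calling it as a plain function triggers the warnings described in the question).

```perl
use strict;
use warnings;
use Encode ();          # make sure the built-in encodings are loaded
use Encode::Alias ();   # import nothing; find_alias is called as a method

# Hypothetical helper (not part of Encode): returns the canonical
# encoding name for $name, or undef when the name cannot be resolved.
sub canonical_encoding_name {
    my ($name) = @_;
    return undef unless defined $name && length $name;

    # Class-method call, so the name lands in $find rather than $class.
    my $enc = Encode::Alias->find_alias($name);
    return $enc ? $enc->name : undef;
}
```

Assuming the lookups behave as in the session above, canonical_encoding_name('Latin9') would give iso-8859-15, and an unresolvable name such as 'UTF-16---LE' would give undef.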


cjm*_*cjm 4

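You can use the find_encoding function from Encode. However, if you want to use the encoding as an :encoding layer, you should also check perlio_ok. An encoding may exist but not support being used with :encoding.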

use Carp qw(croak);
use Encode qw(find_encoding);

sub run {
    my $args = shift;
    my $enc = find_encoding($args->{encoding}) 
      or croak "$args->{encoding} is not a valid encoding";
    $enc->perlio_ok or croak "$args->{encoding} does not support PerlIO";

    my $is_ok = open my $in,
        sprintf('<:encoding(%s)', $enc->name),
        $args->{input_filename}
    ;
}

Note: find_encoding does handle aliases defined by Encode::Alias.

If you don't care about distinguishing between encodings that don't exist and encodings that don't support :encoding, you can just use the perlio_ok function:

Encode::perlio_ok($args->{encoding}) or croak "$args->{encoding} not supported";
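One possible way to keep the validation separate from the open call is sketched below. The helper names read_mode_for and open_for_reading are hypothetical and introduced only for illustration. The sketch uses find_encoding and the perlio_ok method on the returned object, as in the run sub above, so that an unknown name and an encoding without PerlIO support produce different error messages.

```perl
use strict;
use warnings;
use Carp qw(croak);
use Encode qw(find_encoding);

# Hypothetical helper: builds the mode string for a read-only open,
# e.g. '<:encoding(UTF-16LE)', and croaks if the name is unusable.
sub read_mode_for {
    my ($name) = @_;
    croak "no encoding name given" unless defined $name && length $name;

    # find_encoding returns false for unknown names and resolves aliases.
    my $enc = find_encoding($name)
        or croak "$name is not a valid encoding";

    # The encoding exists, but it may not be usable as a PerlIO layer.
    $enc->perlio_ok
        or croak "$name does not support PerlIO";

    return sprintf '<:encoding(%s)', $enc->name;
}

# Hypothetical helper: opens $path for reading with a validated encoding.
sub open_for_reading {
    my ($name, $path) = @_;
    my $mode = read_mode_for($name);
    open my $in, $mode, $path
        or croak "Cannot open $path: $!";
    return $in;
}
```

A caller could wrap open_for_reading($args->{encoding}, $args->{input_filename}) in eval to report a bad encoding name before any file is touched.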