php*_*ete 1 perl encoding character-encoding
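I'm using Perl to pull data from an SQLite database, and the WWW::Mechanize module to do some web scraping.
The data I'm posting (which is stored in the database) contains some ™ characters. When I view the resulting text on the website, it shows a few odd characters: â¢ instead of ™.
I have the following set at the top of my Perl program. I use it to prevent "Wide character" warnings in the terminal.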
binmode(STDOUT, ":utf-8");
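I don't know much about encoding and decoding characters, so any help would be useful.
Edit: after reading up on Perl IO, I was able to find this Stack Overflow answer, which solved my problem.
Decode inputs, encode outputs.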
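As a minimal sketch of what "decode inputs, encode outputs" means in practice, the following uses the core `Encode` module on the trademark symbol from the question. The variable names are illustrative only, and the hard-coded byte string stands in for whatever your program actually reads from a file, socket, or HTTP response.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Hypothetical input: the raw UTF-8 bytes for U+2122 (TRADE MARK SIGN),
# as they might arrive from outside the program.
my $bytes_in = "\xE2\x84\xA2";

# Decode input: three bytes become one character.
my $text = decode('UTF-8', $bytes_in);
printf("bytes in = %d, characters = %d\n", length($bytes_in), length($text));
printf("code point = U+%04X\n", ord($text));

# Encode output: one character becomes three bytes again.
my $bytes_out = encode('UTF-8', $text);
printf("round trip ok = %s\n", $bytes_out eq $bytes_in ? "yes" : "no");
```

Inside the program you work with `$text` (characters); only at the boundaries do you deal with bytes. The `use open`, `-utf8`, and `sqlite_unicode` settings in the code that follows do this decoding and encoding for you at each boundary.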
use open ':std', ':encoding(UTF-8)';  # Outputs are UTF-8
BEGIN { binmode STDIN; }              # ...but not the raw CGI request.
use CGI qw( -utf8 );                  # Decode parameters
use DBI qw( );

{
    my $cgi = CGI->new();
    print $cgi->header(
        -type    => "text/plain",  # Just because it's shorter.
        -charset => "UTF-8",       # Tell browser encoding used.
    );

    my $dbh = DBI->connect(
        "dbi:SQLite:dbname=/tmp/tmp.sqlite", "", "",
        {
            AutoCommit     => 1,
            RaiseError     => 1,
            PrintError     => 0,
            PrintWarn      => 1,
            sqlite_unicode => 1,  # Encode and decode for us.
        },
    );

    $dbh->do("CREATE TABLE Testing ( str TEXT )");

    my $from_html_parser = "\x{2122}";

    # Should be 2122, since the trademark symbol is U+2122.
    printf("from_html_parser = %v04X\n", $from_html_parser);
    print("$from_html_parser\n");

    $dbh->do("INSERT INTO Testing VALUES (?)", undef, $from_html_parser);

    my $from_database = $dbh->selectrow_array("SELECT * FROM Testing");

    # Should be 2122, since the trademark symbol is U+2122.
    printf("from_database = %v04X\n", $from_database);
    print("$from_database\n");
}

END { unlink("/tmp/tmp.sqlite"); }
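
For comparison, here is a small sketch of one plausible way the symptom in the question can arise. It assumes the UTF-8 bytes read from the database were never decoded and were later treated as ISO-8859-1 characters before being encoded again. That is an assumption about the failure mode, not something the question confirms, and the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

my $text = "\x{2122}";               # One character: the trademark symbol.
my $utf8 = encode('UTF-8', $text);   # Three bytes: E2 84 A2.

# Assumed mistake: the bytes are treated as three ISO-8859-1 characters
# (U+00E2, U+0084, U+00A2) instead of being decoded as UTF-8.
my $misread = decode('ISO-8859-1', $utf8);

# Encoding those three characters again yields six bytes, which a
# browser renders as "â", an invisible control character, and "¢".
my $double = encode('UTF-8', $misread);

my $hex = join(' ', map { sprintf('%02X', ord($_)) } split(//, $double));
printf("misread characters = %d\n", length($misread));
print("double-encoded bytes = $hex\n");   # C3 A2 C2 84 C2 A2
```

Decoding at the input boundary, as `sqlite_unicode => 1` does in the code above, means the string is already characters by the time it is encoded for output, so this double encoding cannot happen.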