I'm trying to read a huge gz file line by line in Perl 6.
I'm trying to do something like this:
my $file = 'huge_file.gz';
for $file.IO.lines -> $line {
    say $line;
}
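But this gives an error saying that I have malformed UTF-8. I can't see how to read gzipped material from the help pages https://docs.perl6.org/language/unicode#UTF8-C8 or https://docs.perl6.org/language/io
I want to accomplish the same thing as is done in Perl 5: http://blog-en.openalfa.com/how-to-read-and-write-compressed-files-in-perl
How can I read a gz file line by line in Perl 6?
Thanks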
tim*_*imo 11
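I would recommend using the module Compress::Zlib for this purpose. You can find the readme and code on github and install it with zef install Compress::Zlib.
This example is taken from test file number 3, titled "wrap":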
use Test;
use Compress::Zlib;
gzspurt("t/compressed.gz", "this\nis\na\ntest");
my $wrap = zwrap(open("t/compressed.gz"), :gzip);
is $wrap.get, "this\n", 'first line roundtrips';
is $wrap.get, "is\n", 'second line roundtrips';
is $wrap.get, "a\n", 'third line roundtrips';
is $wrap.get, "test", 'fourth line roundtrips';
This is probably the easiest way to get what you want.
Use the read-file-content method of the Archive::Libarchive module, though I don't know whether that method reads all lines into memory at once:
use Archive::Libarchive;
use Archive::Libarchive::Constants;
my $a = Archive::Libarchive.new: operation => LibarchiveRead, file => 'test.tar.gz';
my Archive::Libarchive::Entry $e .= new;
my $log = '';
while $a.next-header($e) {
    $log = get-log($a,$e) if $e.pathname.ends-with('.txt');
}
sub get-log($a, $e) {
    return $a.read-file-content($e).decode('UTF8-C8');
}
If you're after a quick solution, you can read lines from the stdout pipe of a gzip process:
my $proc = run :out, "gzip", "--to-stdout", "--decompress", "MyFile.gz";
for $proc.out.lines -> $line {
    say $line;
}
$proc.out.close;
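As a possible variation on the pipe approach, here is a minimal sketch that wraps the gzip call in a helper and decodes its output as utf8-c8, so that stray non-UTF-8 bytes in the decompressed data should not abort the loop with a malformed UTF-8 error. The name gz-lines is hypothetical (it is not part of any module), and the sketch assumes a gzip binary is available on the PATH.

```raku
# Hypothetical helper: lazily yields the decompressed lines of a gzip file
# by reading the stdout of an external `gzip` process.
# Assumption: `gzip` is installed and on the PATH.
sub gz-lines(Str:D $path, Str:D :$enc = 'utf8-c8') {
    # :out captures stdout as a pipe; :enc selects how its bytes are decoded.
    my $proc = run 'gzip', '--to-stdout', '--decompress', $path, :out, :$enc;
    gather {
        # Hand out one line at a time instead of slurping the whole file.
        take $_ for $proc.out.lines;
        # Close the pipe once every line has been consumed.
        $proc.out.close;
    }
}

# Example use (file name taken from the question):
#   .say for gz-lines('huge_file.gz');
```

Because the lines come from a lazy gather, the pipe is only closed once the sequence has been read to the end; if you stop early, you would need to hold on to the process and close its output yourself.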