Tho*_*ing 5 postgresql perl filehandle perl-io
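I have an application that accesses a PostgreSQL database and needs to read some large binary data from it, depending on the processing required. This may be hundreds of MB or even several GB of data. Please do not discuss using the file system instead; it is the way it is for now.
The data is simply files of various types, for example a Zip container or some other kind of archive. Some of the required processing is listing the contents of the Zip, possibly extracting some members for further processing, or hashing the stored data. In the end, the data is read many times but written only once, when it is stored.
All the Perl libraries I use can work with file handles: some with IO::Handle, others with IO::String or IO::Scalar, and some only with low-level file handles. So I created a subclass of IO::Handle and IO::Seekable that acts as a wrapper around the corresponding methods of DBD::Pg. In the constructor I create a connection to the database, open the provided LOID for reading, and store the handle provided by Postgres in the instance. My own handle object is then passed to whoever can work with such a file handle, and it can read and seek directly within the blob provided by Postgres.
The problem is libraries that do not work with IO::Handle but with low-level file handles or low-level file handle operations. Digest::MD5 seems to be one, Archive::Zip another. Digest::MD5 croaks and tells me that no handle has been provided. Archive::Zip, on the other hand, tries to create a new handle of its own from mine; it calls IO::Handle::fdopen, which fails in my case.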
    sub fdopen {
        @_ == 3 or croak 'usage: $io->fdopen(FD, MODE)';
        my ($io, $fd, $mode) = @_;
        local(*GLOB);

        if (ref($fd) && "".$fd =~ /GLOB\(/o) {
            # It's a glob reference; Alias it as we cannot get name of anon GLOBs
            my $n = qualify(*GLOB);
            *GLOB = *{*$fd};
            $fd = $n;
        } elsif ($fd =~ m#^\d+$#) {
            # It's an FD number; prefix with "=".
            $fd = "=$fd";
        }

        open($io, _open_mode_string($mode) . '&' . $fd)
            ? $io : undef;
    }
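I guess the problem is the low-level copy of the handle: it drops my own instance, so no instance is left that holds my database connection and everything that goes with it.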
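The failure can be reproduced without a database. The following is a minimal sketch, not the original application code: it calls fdopen once with a handle backed by a real descriptor and once with a bare IO::Handle object that, like a pure wrapper class, was never opened on anything. Using File::Spec->devnull as the real file is an assumption made only to keep the example self-contained.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Spec;

# A handle backed by a real OS file descriptor.
open(my $real, '<', File::Spec->devnull)
    or die "Cannot open the null device: $!";

# An IO::Handle object that was never opened on anything, which is what
# a method-wrapper class looks like to open().
my $fake = IO::Handle->new;

# fdopen() ends in open($io, '<&...'), a descriptor-level duplication.
my $dup_real = IO::Handle->new->fdopen($real, 'r');
my $dup_fake = IO::Handle->new->fdopen($fake, 'r');

print 'dup of real handle: ', (defined $dup_real ? 'ok' : 'failed'), "\n";
print 'dup of fake handle: ', (defined $dup_fake ? 'ok' : 'failed'), "\n";
```

The second call has nothing to duplicate, so the methods defined on the wrapper object are never consulted.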
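So, is it possible in my case to provide some IO::Handle that can successfully be used wherever a low-level file handle is expected?
I mean, I do not have a real file handle. I only have an object whose method calls are wrapped to the corresponding Postgres methods, which need a database handle and so on. All of that data needs to be stored somewhere, the wrapping needs to be done, and so forth.
I tried to do what others do, for example IO::String, which additionally uses tie. But in the end that use case is different, because Perl is able to create a real low-level file handle to some internal memory on its own. Nothing like that is supported in my case. I need to keep my instance around, because only it knows the handle to the database and so on.
Using my handle like an IO::Handle, by calling the read method and the like, works as expected. But I would like to go a step further and be more compatible with code that does not expect to work on IO::Handle objects, much like IO::String or File::Temp can be used as low-level file handles.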
    package ReadingHandle;

    use strict;
    use warnings;
    use 5.10.1;

    use base 'IO::Handle', 'IO::Seekable';

    use Carp ();

    # Opens the large object identified by $loid for reading and keeps the
    # database handle and the descriptor returned by Postgres in the glob.
    sub new
    {
        my $invocant = shift || Carp::croak('No invocant given.');
        my $db       = shift || Carp::croak('No database connection given.');
        my $loid     = shift // Carp::croak('No LOID given.');

        my $dbHandle = $db->_getHandle();
        my $self     = $invocant->SUPER::new();

        *$self->{'dbHandle'} = $dbHandle;
        *$self->{'loid'}     = $loid;

        my $loidFd = $dbHandle->pg_lo_open($loid, $dbHandle->{pg_INV_READ});
        *$self->{'loidFd'} = $loidFd;
        if (!defined($loidFd))
        {
            Carp::croak("The provided LOID couldn't be opened.");
        }

        return $self;
    }

    sub DESTROY
    {
        my $self = shift || Carp::croak('The method needs to be called with an instance.');

        $self->close();
    }

    sub _getDbHandle
    {
        my $self = shift || Carp::croak('The method needs to be called with an instance.');

        return *$self->{'dbHandle'};
    }

    sub _getLoid
    {
        my $self = shift || Carp::croak('The method needs to be called with an instance.');

        return *$self->{'loid'};
    }

    sub _getLoidFd
    {
        my $self = shift || Carp::croak('The method needs to be called with an instance.');

        return *$self->{'loidFd'};
    }

    # Large objects are always binary, so there is nothing to switch.
    sub binmode
    {
        my $self = shift || Carp::croak('The method needs to be called with an instance.');

        return 1;
    }

    sub close
    {
        my $self = shift || Carp::croak('The method needs to be called with an instance.');

        my $dbHandle = $self->_getDbHandle();
        my $loidFd   = $self->_getLoidFd();

        return $dbHandle->pg_lo_close($loidFd);
    }

    sub opened
    {
        my $self = shift || Carp::croak('The method needs to be called with an instance.');

        my $loidFd = $self->_getLoidFd();

        return defined($loidFd) ? 1 : 0;
    }

    sub read
    {
        my $self = shift || Carp::croak('The method needs to be called with an instance.');

        # A reference is always defined, so check the argument list itself
        # before taking a reference to the caller's buffer.
        @_ || Carp::croak('No buffer given.');
        my $buffer = \shift;
        my $length = shift // Carp::croak('No amount of bytes to read given.');
        my $offset = shift || 0;

        if ($offset > 0)
        {
            Carp::croak('Using an offset is not supported.');
        }

        my $dbHandle = $self->_getDbHandle();
        my $loidFd   = $self->_getLoidFd();

        return $dbHandle->pg_lo_read($loidFd, $buffer, $length);
    }

    # Only absolute, non-negative positions are supported.
    sub seek
    {
        my $self   = shift || Carp::croak('The method needs to be called with an instance.');
        my $offset = shift // Carp::croak('No offset given.');
        my $whence = shift // Carp::croak('No whence given.');

        if ($offset < 0)
        {
            Carp::croak('Using a negative offset is not supported.');
        }
        if ($whence != 0)
        {
            Carp::croak('Using a whence other than 0 is not supported.');
        }

        my $dbHandle = $self->_getDbHandle();
        my $loidFd   = $self->_getLoidFd();
        my $retVal   = $dbHandle->pg_lo_lseek($loidFd, $offset, $whence);
        $retVal      = defined($retVal) ? 1 : 0;

        return $retVal;
    }

    sub tell
    {
        my $self = shift || Carp::croak('The method needs to be called with an instance.');

        my $dbHandle = $self->_getDbHandle();
        my $loidFd   = $self->_getLoidFd();
        # pg_lo_tell reports the current position; pg_lo_lseek needs an
        # offset and a whence and cannot be called with the descriptor alone.
        my $retVal   = $dbHandle->pg_lo_tell($loidFd);
        $retVal      = defined($retVal) ? $retVal : -1;

        return $retVal;
    }

    1;
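Because method-style reads already work on such an object, hashing at least does not strictly need a low-level handle. The sketch below is an assumption about how that could look, not part of the original code: md5_hex_via_read is a hypothetical helper that feeds chunks obtained through an IO::Handle-style read method to Digest::MD5's add method instead of calling addfile. An in-memory handle stands in for the blob so the example runs without a database.

```perl
use strict;
use warnings;
use Digest::MD5 ();
use IO::File;

# Hypothetical helper: hash anything that offers an IO::Handle-style
# read($buffer, $length) method, without passing the handle to XS code.
sub md5_hex_via_read {
    my ($handle, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;

    my $ctx = Digest::MD5->new;
    while (1) {
        my $got = $handle->read(my $chunk, $chunk_size);
        die "read failed" unless defined $got;
        last if $got == 0;          # end of data
        $ctx->add($chunk);
    }
    return $ctx->hexdigest;
}

# Stand-in for the blob: an in-memory handle over some generated data.
my $data = join '', map { "line $_\n" } 1 .. 1000;
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";
print md5_hex_via_read($fh, 512), "\n";
```

This only helps for consumers that can be driven chunk by chunk; it does nothing for Archive::Zip, which wants a handle it can duplicate and seek.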
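There is a way to solve this, but it is a little odd. If I am reading your code and comments correctly, your requirements are basically threefold:

1. Be usable in Perl code in the same ways that a normal IO::Handle object is.
2. Work with Archive::Zip, which is implemented mostly in regular Perl and which calls the IO::Handle::fdopen code you posted. That code cannot duplicate the handle because it is not a real handle.
3. Work with Digest::MD5, which is implemented in XS using PerlIO. Because tie-based tricks and Perl's in-memory "fake" filehandles are not usable at that level, this is trickier than 2.

You can achieve all three by using PerlIO layers with PerlIO::via. The code is similar to what you would write with tie (you implement some required-behavior methods). In addition, you can use the "open a variable as a file" feature of open and the pre-rolled IO::Seekable + IO::Handle functionality of IO::File to simplify meeting requirement 1 above.

Below is a sample package that does what you need. It has some caveats:

- It uses a lines arrayref as the file data. If that looks like a fit for your use case, you should adapt it to work with the database.
- It implements only a bare minimum of the PerlIO methods (it knows nothing about SEEK, EOF, BINMODE, and so on). Note that the arguments and expected behavior of the functions you will implement are not the same as what you would write for tie or Tie::Handle; the "interface" has the same names, but the contracts differ.
- Keep all custom state in the *$self->{args} glob field. This is necessary because the blessed object is created twice (once blessed by PerlIO and once by SUPER::new), so state has to be shared through a shared reference. If you replace the args field, or add or remove any other fields, they will be visible only to the set of methods that created them: either the PerlIO methods or the "normal" object methods. See the comment in the constructor for details.
- If something fails underneath a low-level operation like sysread or <$fh>, a lot of code will break or do unexpected things, because it assumes those functions cannot die and are atomic at the operation level. Similarly, when you mess with PerlIO, failure modes easily escape the realm of "die or return an error value" and land in the realm of "segfault or core dump", especially when multiple processes (fork()) or threads are involved. These odd cases are, for example, why the module below is not implemented as IO::File->new; followed by $file->open(... "via:<($class)"); that core dumps for me, and I do not know why. TL;DR: debugging why things fail at the PerlIO level can be annoying; you have been warned :)
- Digest::MD5 does not work with tied handles because it operates at a level "below" the tie magic. PerlIO is one level "lower" than that, but there is yet another level underneath.
- The code could be tidied up, for example by calling open() directly, skipping all the odd pseudo-indirect-object handling, and then wrapping the result in an IO::Handle some other way, such as with IO::Wrap.

The package: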
    package TiedThing;

    use strict;
    use warnings;
    use parent "IO::File";
    use Symbol ();    # for gensym(), used in new()

    our @pushargs;

    sub new {
        my ( $class, $args ) = @_;

        # Build a glob to be used by the PerlIO methods. This does two things:
        # 1. Gets us a place to stick a shared hashref so PerlIO methods and user-
        # -defined object methods can manipulate the same data. They must use the
        # {args} glob field to do that; new fields written will only be visible
        # to the set of methods that created them.
        # 2. Unifies the ways of addressing that across custom functions and PerlIO
        # functions. We could just pass a hashref { args => $args } into PUSHED, but
        # then we'd have to remember "PerlIO functions receive a blessed hashref,
        # custom functions receive a blessed glob" which is lame.
        my $glob = Symbol::gensym();
        *$glob->{args} = $args;
        local @pushargs = ($glob, $class);

        my $self = $class->SUPER::new(\my $unused, "<:via($class)");
        *$self->{args} = $args;

        return $self;
    }

    # Ordinary object method; reads from the shared {args} hashref.
    sub custom {
        my $self = shift;
        return *$self->{args}->{customvalue};
    }

    # PerlIO::via hooks: PUSHED creates the layer object, FILL supplies data.
    sub PUSHED { return bless($pushargs[0], $pushargs[1]); }

    sub FILL { return shift(@{*$_[0]->{args}->{lines}}); }

    1;
Example usage:
    use strict;
    use warnings;
    use feature 'say';
    use Digest::MD5;
    use Archive::Zip;
    use IO::Handle;
    # Assumes the TiedThing package above is already loaded.

    my $object = TiedThing->new({
        lines       => [join("\n", 1..9, 1..9)],
        customvalue => "custom!",
    });

    say "can call custom method: " . $object->custom;
    say "raw read with <>: " . <$object>;

    my $buf;
    read($object, $buf, 10);
    say "raw read with read(): " . $buf;

    undef $buf;
    $object->read($buf, 10);
    say "OO read via IO::File::read (end): " . $buf;

    my $checksummer = Digest::MD5->new;
    $checksummer->addfile($object);
    say "Md5 read: " . $checksummer->hexdigest;

    my $dupto = IO::Handle->new;
    # Doesn't break/return undef; still not usable without implementing
    # more state sharing inside the object.
    say "Can dup handle: " . $dupto->fdopen($object, "r");

    my $archiver = Archive::Zip->new;
    # Dies, but long after the fdopen() call. Can be fixed by implementing more
    # PerlIO methods.
    $archiver->readFromFileHandle($object);
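As a possible starting point for adapting the data source, here is a minimal sketch of a PerlIO::via layer whose FILL pulls chunks from a callback instead of a lines arrayref. The names ChunkSource, $pending_supplier and open_chunk_source are hypothetical and do not come from the package above. The supplier is handed to PUSHED through a localized package variable, in the same spirit as @pushargs. For a database-backed handle the supplier would presumably wrap pg_lo_read; that wiring is an assumption and is not shown or tested here.

```perl
use strict;
use warnings;
use Digest::MD5 ();

package ChunkSource;    # hypothetical layer name

# Supplier for the handle currently being opened. PUSHED only receives
# the class, the mode and the lower handle, so the supplier is passed
# through a localized package variable.
our $pending_supplier;

sub PUSHED {
    my ($class, $mode, $fh) = @_;
    return -1 unless ref $pending_supplier eq 'CODE';
    return bless { supplier => $pending_supplier }, $class;
}

# Called when PerlIO's buffer is empty; returning undef signals end of data.
sub FILL {
    my ($self, $fh) = @_;
    my $chunk = $self->{supplier}->();
    return (defined $chunk && length $chunk) ? $chunk : undef;
}

package main;

# Hypothetical helper: returns a read handle fed by $supplier->().
sub open_chunk_source {
    my ($supplier) = @_;
    local $ChunkSource::pending_supplier = $supplier;
    my $backing = '';    # never read; it only gives open() something to open
    open(my $fh, '<:via(ChunkSource)', \$backing)
        or die "Cannot push :via(ChunkSource): $!";
    return $fh;
}

my @chunks = ('abc', 'def', 'ghi');
my $fh = open_chunk_source(sub { shift @chunks });
print Digest::MD5->new->addfile($fh)->hexdigest, "\n";
```

Like the package above, this implements only PUSHED and FILL, so the same caveats apply: seeking, duplication and anything else Archive::Zip needs would still require more PerlIO methods.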