Perl sockets: parsing packets out of a network stream

Vin*_*ris 0 sockets perl parsing stream

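I am trying to find the right way to parse a data stream with Perl. I have read many examples, docs, and questions, but I cannot find how to cut a "packet" out of the stream and process it. The situation is as follows:

- A data stream arrives from some IP at a given IP and port.
- The stream contains some gibberish, followed by content enclosed between a start marker and an end marker; the data inside is separated by semicolons.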

My attempt so far is to have a socket listen on the port and process the $data variable:

#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# auto-flush STDOUT so progress messages appear immediately
$| = 1;

# creating a listening socket
my $socket = IO::Socket::INET->new(
    LocalHost => '127.0.0.1',
    LocalPort => '7070',
    Proto => 'tcp',
    Listen => 5,
    Reuse => 1
);
die "cannot create socket $!\n" unless $socket;
print "server waiting for client connection on port 7070 \n";

while(1)
{
    # waiting for a new client connection
    my $client_socket = $socket->accept();

    # get information about a newly connected client
    my $client_address = $client_socket->peerhost();
    my $client_port = $client_socket->peerport();
    print "connection from $client_address:$client_port\n";

    # read up to 1024 bytes from the connected client (a single read only)
    my $data = "";
    $client_socket->recv($data, 1024);
    print "received data: $data\n";

    # split whatever arrived on semicolons
    my @data_array = split(/;/,$data);
    foreach (@data_array) {
      print "$_\n";
    }

    # write response data to the connected client
    $data = "ok";
    $client_socket->send($data);

    # notify client that response has been sent
    shutdown($client_socket, 1);
}

$socket->close();

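This works, but as I understand it, it reads the stream into a buffer of the specified size and then processes whatever landed there.

My question: how do I identify the part I need (start to end), process it, and then move on to the next one?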

ike*_*ami 5

I have never understood why people use recv to read from stream sockets.

Typically, a read loop looks like this:

my $buf = '';
while (1) {
   my $rv = sysread($socket, $buf, 64*1024, length($buf));
   if (!defined($rv)) {
      die("Can't read from socket: $!\n");
   }

   if (!$rv) {
      die("Can't read from socket: Premature EOF\n") if length($buf);
      last;
   }

   while (defined(my $msg = check_for_full_message_and_extract_it_from_buf($buf))) {
      process_msg($msg);
   }
}

(Keep in mind that sysread returns as soon as some data is available, even if it is less than the amount requested.)
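
To make the partial-read behaviour concrete, here is a minimal sketch that uses a pipe in place of the TCP socket (an assumption made only so the example runs on its own). It shows sysread returning fewer bytes than requested, appending to the buffer via the offset argument, and returning 0 at EOF.

```perl
use strict;
use warnings;

# A pipe stands in for the TCP socket so the example is self-contained.
pipe(my $reader, my $writer) or die "pipe failed: $!\n";

my $buf = '';

# The peer sends 6 bytes; we ask for up to 64 KiB and get only what is there.
defined(syswrite($writer, "abc;de")) or die "write failed: $!\n";
my $first = sysread($reader, $buf, 64*1024, length($buf));
die "read failed: $!\n" unless defined $first;

# More data arrives later; the 4th argument appends it to the end of $buf.
defined(syswrite($writer, "f;ghi")) or die "write failed: $!\n";
my $second = sysread($reader, $buf, 64*1024, length($buf));
die "read failed: $!\n" unless defined $second;

# Once the writer closes, sysread returns 0 (EOF) rather than undef (error).
close($writer);
my $eof = sysread($reader, $buf, 64*1024, length($buf));
die "read failed: $!\n" unless defined $eof;

print "first=$first second=$second eof=$eof buf=$buf\n";
# prints: first=6 second=5 eof=0 buf=abc;def;ghi
```

Because a message can be split across reads like this, the buffer has to persist between calls and be checked for complete messages after every read.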

For example, the inner loop for sentinel-terminated data looks like this:

   while ($buf =~ s/^(.*)\n//) {
      process_msg("$1");
   }
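
The same loop can be exercised without a socket. In this sketch, extract_lines is a hypothetical helper name (not from the answer) wrapping the substitution above, and the chunks are made-up data fed in pieces to imitate arbitrary read boundaries.

```perl
use strict;
use warnings;

# Hypothetical helper: removes every complete newline-terminated message from
# the buffer (modified in place) and returns them. A partial line stays behind.
sub extract_lines {
    my ($buf_ref) = @_;
    my @msgs;
    while ($$buf_ref =~ s/^(.*)\n//) {
        push @msgs, "$1";
    }
    return @msgs;
}

my $buf = '';
my @seen;

# Simulated arrival of data in chunks that do not line up with messages.
for my $chunk ("a;b", ";c\nd;", "e\nf") {
    $buf .= $chunk;
    push @seen, extract_lines(\$buf);
}

print join('|', @seen), " leftover=$buf\n";
# prints: a;b;c|d;e leftover=f
```

The trailing "f" stays in the buffer until its newline arrives, which is why the outer loop treats leftover data at EOF as a premature end of stream.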

For example, the inner loop for length-prefixed blocks looks like this:

   while (1) {
      last if length($buf) < 4;

      my $len = unpack('N', $buf);
      last if length($buf) < 4+$len;

      substr($buf, 0, 4, '');
      my $msg = substr($buf, 0, $len, '');
      process_msg($msg);
   }
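
As a rough check of the length-prefixed variant, the sketch below wraps the loop in a hypothetical extract_blocks helper and feeds it a hand-built stream in 7-byte pieces. The payloads and the chunk size are assumptions chosen only so that the block boundaries fall in awkward places.

```perl
use strict;
use warnings;

# Hypothetical helper: removes every complete block (4-byte big-endian length
# followed by that many bytes) from the buffer and returns the payloads.
sub extract_blocks {
    my ($buf_ref) = @_;
    my @msgs;
    while (1) {
        last if length($$buf_ref) < 4;            # length header incomplete
        my $len = unpack('N', $$buf_ref);
        last if length($$buf_ref) < 4 + $len;     # payload incomplete
        substr($$buf_ref, 0, 4, '');              # drop the header
        push @msgs, substr($$buf_ref, 0, $len, '');
    }
    return @msgs;
}

# Build a stream of two blocks, then deliver it 7 bytes at a time.
my $stream = join '', map { pack('N', length($_)) . $_ } ('hello', 'a;b;c');
my $buf = '';
my @seen;
while (length $stream) {
    $buf .= substr($stream, 0, 7, '');
    push @seen, extract_blocks(\$buf);
}

print join('|', @seen), " leftover_bytes=", length($buf), "\n";
# prints: hello|a;b;c leftover_bytes=0
```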

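In your particular case, you would delete from the start of $buf all the data you want to ignore until you reach the part you are interested in, and then start extracting the items you are interested in. That's vague, but I only have a vague description of the protocol to work with.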
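
One possible shape for that, as a sketch only: the real markers are not given in the question, so `<START>` and `<END>` below are placeholders, and extract_packets is a hypothetical helper name. It discards junk before the start marker, waits if the end marker has not arrived yet, and splits each payload on semicolons.

```perl
use strict;
use warnings;

# Hypothetical helper. $start and $end are placeholder markers; substitute
# whatever the actual protocol uses.
sub extract_packets {
    my ($buf_ref, $start, $end) = @_;
    my @packets;
    while (1) {
        my $s = index($$buf_ref, $start);
        if ($s < 0) {
            # No start marker yet: drop the junk, but keep a tail that could be
            # the first part of a start marker split across two reads.
            my $keep = length($start) - 1;
            substr($$buf_ref, 0, length($$buf_ref) - $keep, '')
                if length($$buf_ref) > $keep;
            last;
        }
        substr($$buf_ref, 0, $s, '');                    # discard leading junk
        my $e = index($$buf_ref, $end, length($start));
        last if $e < 0;                                  # wait for more data
        my $payload = substr($$buf_ref, length($start), $e - length($start));
        substr($$buf_ref, 0, $e + length($end), '');     # consume the packet
        push @packets, [ split /;/, $payload ];
    }
    return @packets;
}

my $buf = '';
my @packets;
# Made-up chunks: junk, a marker split across reads, and two packets.
for my $chunk ("xx\x01<STA", "RT>a;b;c<END>junk<START>d;", "e<END>zz") {
    $buf .= $chunk;
    push @packets, extract_packets(\$buf, '<START>', '<END>');
}

print join('|', map { join(',', @$_) } @packets), " leftover=$buf\n";
# prints: a,b,c|d,e leftover=zz
```

In the read loop, a call like this would take the place of check_for_full_message_and_extract_it_from_buf, with the buffer capped at some maximum size if the sender might never deliver an end marker.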