Do you have to check exit_status_ready if you are going to check recv_ready()?

Question

I am running a remote command with:

import paramiko

ssh = paramiko.SSHClient()
ssh.connect(host)
stdin, stdout, stderr = ssh.exec_command(cmd)

Now I want to get the output. I have seen things like this:

# Wait for the command to finish
while not stdout.channel.exit_status_ready():
    if stdout.channel.recv_ready():
        stdoutLines = stdout.readlines()

But that sometimes never runs readlines() at all, even when there is supposed to be data on stdout. To me this suggests that stdout.channel.recv_ready() is not necessarily ready (True) as soon as stdout.channel.exit_status_ready() is True.

Is something like this appropriate?

# Wait until the data is available
while not stdout.channel.recv_ready():
    pass

stdoutLines = stdout.readlines()

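That is, do I really have to check the exit status first, before waiting for recv_ready() to say the data is ready?

How would I know whether there is supposed to be data on stdout before waiting in an infinite loop for stdout.channel.recv_ready() to become True (which it never does if the command produces no stdout output)?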

Answer 1

tin*_*tin 22

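That is, do I really have to check the exit status first, before waiting for recv_ready() to say the data is ready?

No. It is perfectly fine to receive data (e.g. stdout/stderr) from the remote process even though it has not finished yet. In addition, some sshd implementations do not even provide the exit status of the remote process, in which case you will run into problems; see the paramiko doc: exit_status_ready.

The problem with waiting on exit_status_ready() for short-lived remote commands is that the local channel thread may receive the exit code before your code checks the loop condition. In that case you never enter the loop, and readlines() is never called. Here is an example: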

# spawns new thread to communicate with remote
# executes whoami which exits pretty fast
stdin, stdout, stderr = ssh.exec_command("whoami") 
time.sleep(5)  # main thread waits 5 seconds
# command already finished, exit code already received
#  and set by the exec_command thread.
# therefore the loop condition is not met 
#  as exit_status_ready() already returns True 
#  (remember, remote command already exited and was handled by a different thread)
while not stdout.channel.exit_status_ready():
    if stdout.channel.recv_ready():
        stdoutLines = stdout.readlines()

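The race can be reproduced without an SSH server. The sketch below uses a hypothetical stand-in class, FakeChannel, that mimics only the three channel methods involved (it is not part of paramiko, and it calls recv() where the question used readlines()). It models a command that has already exited while its output is still sitting in the local buffer:

```python
class FakeChannel:
    """Hypothetical stand-in for a paramiko channel (illustration only)."""

    def __init__(self, buffered, exited):
        self._buffer = buffered      # bytes already received locally
        self._exited = exited        # True once the exit status has arrived

    def exit_status_ready(self):
        return self._exited

    def recv_ready(self):
        return len(self._buffer) > 0

    def recv(self, nbytes):
        data, self._buffer = self._buffer[:nbytes], self._buffer[nbytes:]
        return data


def read_like_the_question(channel):
    """Loop from the question: skipped entirely if the command already exited."""
    chunks = []
    while not channel.exit_status_ready():
        if channel.recv_ready():
            chunks.append(channel.recv(4096))
    return b"".join(chunks)


def read_then_check_exit(channel):
    """Keep reading while data is buffered or the exit status is still missing."""
    chunks = []
    while channel.recv_ready() or not channel.exit_status_ready():
        if channel.recv_ready():
            chunks.append(channel.recv(4096))
    return b"".join(chunks)


lost = read_like_the_question(FakeChannel(b"alice\n", exited=True))
kept = read_then_check_exit(FakeChannel(b"alice\n", exited=True))
print(lost, kept)  # b'' b'alice\n'
```

The second loop is still a busy loop, and it assumes all output has reached the local buffer by the time the exit status arrives, so treat it as an illustration of the ordering problem rather than a finished reader.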

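How would I know whether there is supposed to be data on stdout before waiting in an infinite loop for stdout.channel.recv_ready() to become True (which it never does if the command produces no stdout output)?

channel.recv_ready() only indicates that there is unread data in the buffer.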

def recv_ready(self):
    """
    Returns true if data is buffered and ready to be read from this
    channel.  A ``False`` result does not mean that the channel has closed;
    it means you may need to wait before more data arrives.

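This means that recv_ready() can be False simply because of the network (delayed packets, retransmissions, ...) or because your remote process does not write to stdout/stderr on a regular basis. Using recv_ready() as the loop condition can therefore make your code return prematurely: it is perfectly normal for it to yield True on one iteration (the remote process wrote to stdout and your local channel thread received that output) and False on another (e.g. the remote process is sleeping and not writing to stdout).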
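
Since a False from recv_ready() cannot distinguish "no output yet" from "no output ever", one alternative is to skip the polling and rely on the documented behaviour of Channel.recv(): it blocks until data arrives and returns an empty byte string once the stream has closed. A minimal sketch, assuming stderr output is small or has been merged into stdout (for example with channel.set_combine_stderr(True)); read_until_eof and chunk_size are names chosen here for illustration:

```python
def read_until_eof(channel, chunk_size=4096):
    """Collect stdout from a paramiko-style channel until the stream closes."""
    chunks = []
    while True:
        data = channel.recv(chunk_size)   # blocks until data arrives or the stream ends
        if not data:                      # empty result: the stream has closed
            break
        chunks.append(data)
    return b"".join(chunks)
```

It would be called as read_until_eof(stdout.channel), followed by stdout.channel.recv_exit_status() if the exit code is needed. A command that never closes its output would block this loop forever, so it is usually paired with a channel timeout.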

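Besides that, people occasionally experience paramiko hangs that may be related to the stdout/stderr buffers filling up (possibly the same class of problem as with Popen, where a process hangs when you never read from stdout/stderr and the internal buffers fill up).

The code below implements a chunked solution that reads from stdout/stderr, emptying the buffers while the channel is open.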

import select

def myexec(ssh, cmd, timeout, want_exitcode=False):
  # one channel per command
  stdin, stdout, stderr = ssh.exec_command(cmd) 
  # get the shared channel for stdout/stderr/stdin
  channel = stdout.channel

  # we do not need stdin.
  stdin.close()                 
  # indicate that we're not going to write to that channel anymore
  channel.shutdown_write()      

  # read stdout/stderr in order to prevent read block hangs
  stdout_chunks = []
  stdout_chunks.append(stdout.channel.recv(len(stdout.channel.in_buffer)))
  # chunked read to prevent stalls
  while not channel.closed or channel.recv_ready() or channel.recv_stderr_ready(): 
      # stop if channel was closed prematurely, and there is no data in the buffers.
      got_chunk = False
      readq, _, _ = select.select([stdout.channel], [], [], timeout)
      for c in readq:
          if c.recv_ready(): 
              stdout_chunks.append(stdout.channel.recv(len(c.in_buffer)))
              got_chunk = True
          if c.recv_stderr_ready(): 
              # make sure to read stderr to prevent stall    
              stderr.channel.recv_stderr(len(c.in_stderr_buffer))  
              got_chunk = True  
      # 1) Make sure there are at least 2 cycles with no data in the input
      #    buffers so we do not exit too early (e.g. cat on a >200k file).
      # 2) If no data arrived in the last loop, check whether we already
      #    received the exit code.
      # 3) Check whether the input buffers are empty.
      # 4) Exit the loop.
      if not got_chunk \
          and stdout.channel.exit_status_ready() \
          and not stderr.channel.recv_stderr_ready() \
          and not stdout.channel.recv_ready(): 
          # indicate that we're not going to read from this channel anymore
          stdout.channel.shutdown_read()  
          # close the channel
          stdout.channel.close()
          break    # exit as remote side is finished and our buffers are empty

  # close all the pseudofiles
  stdout.close()
  stderr.close()

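  # recv() returns bytes on Python 3, so join as bytes and decode once
  output = b''.join(stdout_chunks).decode('utf-8', errors='replace')
  if want_exitcode:
      # exit code is always ready at this point
      return (output, stdout.channel.recv_exit_status())
  return output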

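channel.closed is just the ultimate exit condition in case the channel closes prematurely. Right after a chunk has been read, the code checks whether the exit status was already received and whether no new data was buffered in the meantime. If new data arrived or no exit status was received, the code keeps trying to read chunks. Once the remote process has exited and there is no new data in the buffers, we assume we have read everything and begin closing the channel. Note that if you want the exit status, you should always wait until it has been received; otherwise paramiko might block forever.

This way it is guaranteed that the buffers do not fill up and make your process hang. myexec only returns once the remote command has exited and there is no data left in the local buffers. The code is also a bit more CPU friendly because it uses select() instead of polling in a busy loop, but it might be a bit slower for short-lived commands.

Just for reference, to safeguard against some infinite loops you can set a channel timeout that fires when no data arrives for a period of time: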

 chan.settimeout(timeout)
 chan.exec_command(command)

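With a timeout set, a blocking Channel.recv() raises socket.timeout when no data arrives in time, so the reading code has to catch it. A sketch under that assumption; read_with_timeout is a hypothetical helper name, it sets the timeout on a channel whose command is already running, and what to do after a timeout (close the channel, retry, report an error) is left to the caller:

```python
import socket


def read_with_timeout(channel, timeout, chunk_size=4096):
    """Read until the stream closes, giving up if no data arrives within `timeout` seconds.

    Returns a tuple (data, timed_out).
    """
    channel.settimeout(timeout)
    chunks = []
    timed_out = False
    try:
        while True:
            data = channel.recv(chunk_size)
            if not data:          # stream closed by the remote side
                break
            chunks.append(data)
    except socket.timeout:        # nothing arrived within the timeout window
        timed_out = True
    return b"".join(chunks), timed_out
```

Whatever was received before the timeout is still returned, so the caller can decide whether partial output is useful.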

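Thanks for the detailed explanation. It seems I have the same problem. For some reason I hit a case (on the first iteration) where the exit code is ready but stdout/stderr are not, so it never even enters the loop. That is odd: how is that possible? Could you explain the code a bit more? Why repeat the checks in both the while and the if? Isn't the while check enough? Also, why read before the while loop? It seems the same thing would happen automatically, since recv_ready() would be true on the first iteration, wouldn't it? Also, channel.close is not documented, right? (2 upvotes)
Why are "channel" and "stdout.channel" used interchangeably? Is there a reason, or is it just a leftover in the code? (2 upvotes)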

Archived: 11 years, 6 months ago
Viewed: 11061 times
Last activity: 8 years, 2 months ago