Why does client.recv(1024) return an empty bytes literal in this simple WebSocket server implementation?

Viz*_*ary 6 javascript python websocket

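I need to do a WebSocket client-server exchange between Python and JavaScript on an air-gapped network, so I am limited to what I can read and type in by hand (believe me, I would love to be able to run pip install websockets). This is a basic RFC 6455 WebSocket client-server relationship between Python and JavaScript. Below the code, I point out the specific problem: client.recv(1024) returns an empty bytes literal, which causes the WebSocket server implementation to abort the connection.

Client: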

<script>
    const message = { 
        name: "ping",
        data: 0
    }
    const socket = new WebSocket("ws://localhost:8000")
    socket.addEventListener("open", (event) => {
        console.log("socket connected to server")
        socket.send(JSON.stringify(message))
    })
    socket.addEventListener("message", (event) => {
        console.log("message from socket server:", JSON.parse(event.data))
    })
</script>

Server, found here (a minimal implementation of RFC 6455):

import array
import time
import socket
import hashlib
import sys
from select import select
import re
import logging
from threading import Thread
import signal
from base64 import b64encode

class WebSocket(object):
    handshake = (
        "HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
        "Upgrade: WebSocket\r\n"
        "Connection: Upgrade\r\n"
        "WebSocket-Origin: %(origin)s\r\n"
        "WebSocket-Location: ws://%(bind)s:%(port)s/\r\n"
        "Sec-Websocket-Accept: %(accept)s\r\n"
        "Sec-Websocket-Origin: %(origin)s\r\n"
        "Sec-Websocket-Location: ws://%(bind)s:%(port)s/\r\n"
        "\r\n"
    )
    def __init__(self, client, server):
        self.client = client
        self.server = server
        self.handshaken = False
        self.header = ""
        self.data = ""

    def feed(self, data):
        if not self.handshaken:
            self.header += str(data)
            if self.header.find('\\r\\n\\r\\n') != -1:
                parts = self.header.split('\\r\\n\\r\\n', 1)
                self.header = parts[0]
                if self.dohandshake(self.header, parts[1]):
                    logging.info("Handshake successful")
                    self.handshaken = True
        else:
            self.data += data.decode("utf-8", "ignore")
            playloadData = data[6:]
            mask = data[2:6]
            unmasked = array.array("B", playloadData)
            for i in range(len(playloadData)):
                unmasked[i] = unmasked[i] ^ mask[i % 4]
            self.onmessage(bytes(unmasked).decode("utf-8", "ignore"))

    def dohandshake(self, header, key=None):
        logging.debug("Begin handshake: %s" % header)
        digitRe = re.compile(r'[^0-9]')
        spacesRe = re.compile(r'\s')
        part = part_1 = part_2 = origin = None
        for line in header.split('\\r\\n')[1:]:
            name, value = line.split(': ', 1)
            if name.lower() == "sec-websocket-key1":
                key_number_1 = int(digitRe.sub('', value))
                spaces_1 = len(spacesRe.findall(value))
                if spaces_1 == 0:
                    return False
                if key_number_1 % spaces_1 != 0:
                    return False
                part_1 = key_number_1 / spaces_1
            elif name.lower() == "sec-websocket-key2":
                key_number_2 = int(digitRe.sub('', value))
                spaces_2 = len(spacesRe.findall(value))
                if spaces_2 == 0:
                    return False
                if key_number_2 % spaces_2 != 0:
                    return False
                part_2 = key_number_2 / spaces_2
            elif name.lower() == "sec-websocket-key":
                part = bytes(value, 'UTF-8')
            elif name.lower() == "origin":
                origin = value
        if part:
            sha1 = hashlib.sha1()
            sha1.update(part)
            sha1.update("258EAFA5-E914-47DA-95CA-C5AB0DC85B11".encode('utf-8'))
            accept = (b64encode(sha1.digest())).decode("utf-8", "ignore")
            handshake = WebSocket.handshake % {
                'accept': accept,
                'origin': origin,
                'port': self.server.port,
                'bind': self.server.bind
            }
            #handshake += response
        else:
            logging.warning("Not using challenge + response")
            handshake = WebSocket.handshake % {
                'origin': origin,
                'port': self.server.port,
                'bind': self.server.bind
            }
        logging.debug("Sending handshake %s" % handshake)
        self.client.send(bytes(handshake, 'UTF-8'))
        return True

    def onmessage(self, data):
        logging.info("Got message: %s" % data)

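    def send(self, data):
        logging.info("Sent message: %s" % data)
        # Unmasked RFC 6455 text frame: FIN + opcode 0x1, then a 7-bit length.
        payload = data.encode("utf-8")
        if len(payload) > 125:
            raise ValueError("payloads over 125 bytes need extended length framing")
        self.client.send(bytes([0x81, len(payload)]) + payload)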

    def close(self):
        self.client.close()

class WebSocketServer(object):
    def __init__(self, bind, port, cls):
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.socket.bind((bind, port))
        self.bind = bind
        self.port = port
        self.cls = cls
        self.connections = {}
        self.listeners = [self.socket]

    def listen(self, backlog=5):
        self.socket.listen(backlog)
        logging.info("Listening on %s" % self.port)
        self.running = True
        while self.running:
            # upon first connection rList = [784] and the other two are empty
            rList, wList, xList = select(self.listeners, [], self.listeners, 1)
            for ready in rList:
                if ready == self.socket:
                    logging.debug("New client connection")
                    client, address = self.socket.accept()
                    fileno = client.fileno()
                    self.listeners.append(fileno)
                    self.connections[fileno] = self.cls(client, self)
                else:
                    logging.debug("Client ready for reading %s" % ready)
                    client = self.connections[ready].client
                    data = client.recv(1024) # currently, this results in: b''
                    fileno = client.fileno()
                    if data: # data = b''
                        self.connections[fileno].feed(data)
                    else:
                        logging.debug("Closing client %s" % ready)
                        self.connections[fileno].close()
                        del self.connections[fileno]
                        self.listeners.remove(ready)
            for failed in xList:
                if failed == self.socket:
                    logging.error("Socket broke")
                    for fileno, conn in self.connections.items():
                        conn.close()
                    self.running = False

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, 
        format="%(asctime)s - %(levelname)s - %(message)s")
    server = WebSocketServer("localhost", 8000, WebSocket)
    server_thread = Thread(target=server.listen, args=[5])
    server_thread.start()
    # Add SIGINT handler for killing the threads
    def signal_handler(signal, frame):
        logging.info("Caught Ctrl+C, shutting down...")
        server.running = False
        sys.exit()
    signal.signal(signal.SIGINT, signal_handler)
    while True:
        time.sleep(100)

Server-side log:

INFO - Hanshake successful
DEBUG - Client ready for reading 664
DEBUG - Closing client 664

On the client side I get:

WebSocket connection to 'ws://localhost:8000' failed: Unknown Reason

I traced the problem to this point:

if data:
    self.connections[fileno].feed(data)
else: # this is being triggered on the server side 
    logging.debug("Closing client %s" % ready)

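Researching this, I found a potential issue in the Python documentation for select, which is used to retrieve rlist, wlist, and xlist:

select.select(rlist, wlist, xlist[, timeout]) This is a straightforward interface to the Unix select() system call. The first three arguments are iterables of "waitable objects": either integers representing file descriptors or objects with a parameterless method named fileno() returning such an integer: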

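rlist: wait until ready for reading

wlist: wait until ready for writing

xlist: wait for an "exceptional condition" (see the manual page for what your system considers such a condition)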

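Seeing that the function is based on a Unix system call, I realized this code might not support Windows, which is my environment. I checked the values of rlist, wlist, and xlist: on the first iteration rList = [784] (or some other number, such as 664) and the other two are empty, after which the connection is closed.

The documentation goes on to note:

Note: File objects on Windows are not acceptable, but sockets are. On Windows, the underlying select() function is provided by the WinSock library, and does not handle file descriptors that don't originate from WinSock.

But I am not clear on exactly what this means.

So I added some logging to the code and traced the problem here: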

rList, wList, xList = select(self.listeners, [], self.listeners, 1)
    for ready in rList: # rList = [836] or some other number
        # and then we check if ready (so the 836 int) == self.socket
        # but if we log self.socket we get this:
        # <socket.socket fd=772, family=AddressFamily.AF_INET, 
        # type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 8000)>
        # so of course an integer isn't going to be equivalent to that
        if ready == self.socket:
            logging.debug("New client connection")
            #so lets skip this code and see what the other condition does
        else:
            logging.debug("Client ready for reading %s" % ready)
            client = self.connections[ready].client
            data = client.recv(1024) # currently, this results in: b''
            fileno = client.fileno()
            if data: # data = b'', so this is handled as falsy
                self.connections[fileno].feed(data)
            else:
                logging.debug("Closing client %s" % ready)
            

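As for why client.recv(1024) returns an empty bytes string, I don't know. I don't know whether rList is supposed to contain more than one integer, or whether the protocol is working as intended right up until recv.

Can anyone explain what is causing the .recv call to break here? Is the client-side JavaScript WebSocket not sending some data that the server should expect? Or is the WebSocket server at fault, and if so, what is wrong with it?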
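
On what rList is expected to hold: select.select returns the same objects it was given, so with this server the listening socket comes back as a socket object, while each client comes back as the integer that was appended to self.listeners. A small sketch of that behavior, assuming a connected pair from socket.socketpair() in place of a real browser connection (the function name is only illustrative):

```python
import select
import socket


def ready_objects():
    """Show that select() hands back whatever waitable objects it was passed."""
    a, b = socket.socketpair()
    try:
        a.sendall(b"x")  # make b readable
        by_object, _, _ = select.select([b], [], [], 1)
        by_fileno, _, _ = select.select([b.fileno()], [], [], 1)
        # Passing the socket yields the socket; passing its fileno yields an int.
        return by_object[0] is b, isinstance(by_fileno[0], int)
    finally:
        a.close()
        b.close()
```

So a single integer in rList is consistent with one connected client being ready for reading; it does not by itself indicate a problem with select.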

小智 1

I tried running your example and it seems to work as expected. At least, the server log ends with the following line:

INFO - Got message: {"name":"ping","data":0}

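My environment:

  • OS: Arch Linux;
  • WebSocket client: Chromium/85.0.4183.121 running the JS code you provided;
  • WebSocket server: Python/3.8.5 running the Python code you provided;

The select.select docstring does indeed state

On Windows, only sockets are supported

but the OS is most likely irrelevant, since the server code passes only sockets as select.select arguments.

recv returns an empty bytes string when the reading end of the socket has been closed. From the recv(3) man page:

If no messages are available to be received and the peer has performed an orderly shutdown, recv() shall return 0.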
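
This can be seen in isolation. A minimal sketch, assuming a connected pair from socket.socketpair() stands in for the browser and the server (the function name is only illustrative): once the peer closes, recv first drains whatever was buffered and then returns b''.

```python
import socket


def recv_after_peer_close():
    """Return what recv() yields before and after the peer closes."""
    peer, local = socket.socketpair()
    try:
        peer.sendall(b"hello")
        peer.close()                # orderly shutdown by the other side
        first = local.recv(1024)    # data sent before the close is still delivered
        second = local.recv(1024)   # nothing left and peer is gone: empty bytes
        return first, second
    finally:
        local.close()
```

An empty result from recv therefore says that the other end closed the connection, not that select misreported readiness.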

Interestingly, your server log contains a message about a successful handshake:

INFO - Hanshake successful

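This means that in your case the connection between the client and the server was established and some data was transferred in both directions. After that, the socket was closed. Looking at the server code, I see no reason for the server to drop the connection, so I suspect the client you are using is to blame.

To find out exactly what is going wrong, try intercepting the network traffic with tcpdump or wireshark while running the following Python WebSocket client script, which reproduces what my browser did when I tested: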
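
For context, the server-side part of that handshake comes down to the accept-key computation done in dohandshake. A minimal sketch of that step using the sample key from RFC 6455; compute_accept is an illustrative name and not part of the server code above.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key):
    """Derive the Sec-WebSocket-Accept value from the client's key header."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3
print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A browser only reports the socket as open if the value it receives matches this computation, so comparing the two in a packet capture is one way to check the server's response.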
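
The script itself is not preserved in this copy of the answer. As a stand-in, here is a minimal sketch of such a client, assuming the server from the question is listening on localhost:8000. It is not the original script; the function names, header set, and sample message are assumptions modeled on the browser code in the question.

```python
import base64
import os
import socket


def build_handshake_request(host, port, key):
    """Build an HTTP upgrade request similar to what a browser sends."""
    return (
        "GET / HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        f"Origin: http://{host}:{port}\r\n"
        "\r\n"
    ).encode("ascii")


def build_text_frame(text, mask_key):
    """Build a single masked text frame (client-to-server frames must be masked)."""
    payload = text.encode("utf-8")
    if len(payload) > 125:
        raise ValueError("this sketch only handles payloads up to 125 bytes")
    header = bytes([0x81, 0x80 | len(payload)])  # FIN + text opcode, MASK bit + length
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return header + mask_key + masked


def run_client(host="localhost", port=8000):
    """Connect, perform the handshake, print the reply, and send one message."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_handshake_request(host, port, key))
        print(sock.recv(4096).decode("latin-1"))  # the server's handshake response
        sock.sendall(build_text_frame('{"name":"ping","data":0}', os.urandom(4)))


if __name__ == "__main__":
    run_client()
```

If this client gets a "Got message" line out of the server while the browser does not, the difference between the two captures should show where the browser gives up.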
