Docker容器中本地模式的PySpark异常异常

Sco*_*ieh 5 macos docker pyspark

我在test.py中有一个想要执行的Spark应用程序。简单来说,我安装了Spark 2.3.0,然后想执行test.py。当我在我的开发机器Mac Book 上这样做时,一切都很好。但是当我尝试在我的Mac Book上的Docker容器中做同样的事情时,我会遇到以下异常,我在没有胜利的情况下从谷歌搜索了可能的提示。

目前收集到的线索

  1. 在我的笔记本电脑上运行没有Docker 的test.py绝对没问题。
  2. 保存的时候好像跟记录号有关系。
    I. 三个动作的逻辑都差不多。在Docker容器中,第一个动作执行得很好。但是,当Spark尝试将结果保存到AWS S3时,接下来的两个每次都会失败。
    二、如果我在最后 2 个动作(如 5000)中将数据帧限制为输出,则程序将执行得很好。

Spark应用逻辑

从 MySQL 或 S3 检索数据,进行一些计算,然后将结果存储到AWS S3 上

异常信息

Exception happened during processing of request from ('127.0.0.1', 52218)
2018-04-24 06:17:19,572 678 py4j.java_gateway INFO:Error while receiving.
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 1062, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty
2018-04-24 06:17:19,626 678 root ERROR:Exception while sending command.
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 1062, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 908, in send_command
    response = connection.send_command(command)
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 1067, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Traceback (most recent call last):
Traceback (most recent call last):
  File "py/apmain.py", line 36, in <module>
  File "/usr/lib64/python3.6/socketserver.py", line 317, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib64/python3.6/socketserver.py", line 348, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib64/python3.6/socketserver.py", line 361, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib64/python3.6/socketserver.py", line 696, in __init__
    self.handle()
  File "/opt/spark/python/pyspark/accumulators.py", line 235, in handle
    num_updates = read_int(self.rfile)
  File "/opt/spark/python/pyspark/serializers.py", line 685, in read_int
    raise EOFError
EOFError
----------------------------------------
    main()
  File "py/apmain.py", line 30, in main
    engine_main.main(sys.argv)
  File "/opt/ap2126/py/dtt/ml/framework/engine_main.py", line 46, in main
    raise e
  File "/opt/ap2126/py/dtt/ml/framework/engine_main.py", line 36, in main
    engine.execute(tags)
  File "/opt/ap2126/py/dtt/ml/framework/engine.py", line 152, in execute
    passing_datas = layer_handler.execute(passing_datas, tags, is_load_cache=True)
  File "/opt/ap2126/py/dtt/ml/framework/layer_handler.py", line 319, in execute
    is_load_cache)
  File "/opt/ap2126/py/dtt/ml/framework/layer_handler.py", line 87, in _execute_workers_as_sequence
    output_data = worker_handler.worker.do_job(input_datas, results)
  File "/opt/ap2126/py/ap2126/label_extraction.py", line 498, in do_job
    self._get_lbls_from_structured_job('mysql')
  File "/opt/ap2126/py/ap2126/label_extraction.py", line 583, in _get_lbls_from_structured_job
    .json(job_save_temp_path)
  File "/opt/spark/python/pyspark/sql/readwriter.py", line 775, in json
    self._jwrite.json(path)
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 1160, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/opt/spark/python/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/usr/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name))
py4j.protocol.Py4JError: An error occurred while calling o818.json
2018-04-24 06:17:23,151 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,152 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,153 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,154 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,155 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,156 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,156 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,158 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,159 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,159 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,216 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,217 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,218 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,219 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,220 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,222 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
2018-04-24 06:17:23,223 678 py4j.java_gateway ERROR:An error occurred while trying to connect to the Java server (127.0.0.1:38899)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 852, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/py4j/java_gateway.py", line 990, in start
    self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
Run Code Online (Sandbox Code Playgroud)

Dockerfile 的一部分

    .
    .
    FROM centos:centos7
    .
    .
    RUN yum -y install https://centos7.iuscommunity.org/ius-release.rpm && \
        yum -y install python36u python36u-pip python36u-devel && \
        python3.6 -m pip install --upgrade pip && \
        echo "export PYTHONIOENCODING=utf-8" >> ~/.bashrc && \
        echo "alias python='python3'" >> ~/.bashrc && \
        # Installation of the Java Runtime Environment
        su -c "yum -y install java-1.8.0-openjdk" && \
        yum -y install mlocate; updatedb && \
        echo "export JAVA_HOME=\"$(locate bin/java | grep jvm | sed 's+/bin/java++g')\"" >> ~/.bashrc && \
        # Installation of Spark for running `pyspark` with success
        curl -s http://ftp.twaren.net/Unix/Web/apache/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz  | tar -zx -C /opt/ && \
        ln -s /opt/spark-2.3.0-bin-hadoop2.7 /opt/spark && \
        echo "export SPARK_HOME=/opt/spark >> ~/.bashrc"; source ~/.bashrc  && \
        echo "export PYTHONPATH="$SPARK_HOME"/python/" >> ~/.bashrc; source ~/.bashrc && \
        echo "export PYTHONPATH="$PYTHONPATH":./py/" >> ~/.bashrc && \
        echo "export PYSPARK_PYTHON=/usr/bin/python3" >> ~/.bashrc; source ~/.bashrc && \
        # For the interaction with AWS S3 on Spark
        curl -s http://central.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.3/hadoop-aws-2.7.3.jar -o ${SPARK_HOME}/jars/hadoop-aws-2.7.3.jar && \
        curl -s http://central.maven.org/maven2/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar -o ${SPARK_HOME}/jars/aws-java-sdk-1.7.4.jar
    .
    .
    .
Run Code Online (Sandbox Code Playgroud)

我如何运行 Docker

docker run -it --rm ${image id}

Mar*_*ado 3

这是由于 docker 守护进程由于运行容器的内存限制而杀死了 pyspark 进程的同伴 java 进程。python 对应部分检测到套接字关闭并抛出 EOF。当尝试重新连接到死进程时会启动其他异常。

您应该检查进程内存消耗(特别是实际执行大部分 Spark 操作的 java 进程),并根据运行容器时的要求调整内存设置(--memory、--memory-swap)(https://docs. docker.com/config/containers/resource_constraints/)。