Tags: windows, anaconda, apache-spark, pyspark, jupyter-notebook
I am using the PySpark kernel in Jupyter Notebook. I can select the PySpark kernel successfully, but I keep getting the following error:

The code failed because of a fatal error: Error sending http request and maximum retry encountered. Some things to try:
a) Make sure Spark has enough available resources for Jupyter to create a Spark context.
b) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.
c) Restart the kernel.

Here is the log:
2019-10-10 13:37:43,741 DEBUG SparkMagics Initialized spark magics.
2019-10-10 13:37:43,742 INFO EventsHandler InstanceId: 32a21583-6879-4ad5-88bf-e07af0b09387,EventName: notebookLoaded,Timestamp: 2019-10-10 10:37:43.742475
2019-10-10 13:37:43,744 DEBUG python_jupyter_kernel Loaded magics.
2019-10-10 13:37:43,744 DEBUG python_jupyter_kernel Changed language.
2019-10-10 13:37:44,356 DEBUG python_jupyter_kernel Registered auto viz.
2019-10-10 13:37:45,440 INFO EventsHandler InstanceId: 32a21583-6879-4ad5-88bf-e07af0b09387,EventName: notebookSessionCreationStart,Timestamp: 2019-10-10 10:37:45.440323,SessionGuid: d230b1f3-6bb1-4a66-bde1-7a73a14d7939,LivyKind: pyspark
2019-10-10 13:37:49,591 ERROR ReliableHttpClient Request to 'http://localhost:8998/sessions' failed with 'HTTPConnectionPool(host='localhost', port=8998): Max retries exceeded with url: /sessions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000013184159808>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))'
2019-10-10 13:37:49,591 INFO EventsHandler InstanceId: 32a21583-6879-4ad5-88bf-e07af0b09387,EventName: notebookSessionCreationEnd,Timestamp: 2019-10-10 10:37:49.591650,SessionGuid: d230b1f3-6bb1-4a66-bde1-7a73a14d7939,LivyKind: pyspark,SessionId: -1,Status: not_started,Success: False,ExceptionType: HttpClientException,ExceptionMessage: Error sending http request and maximum retry encountered.
2019-10-10 13:37:49,591 ERROR SparkMagics Error creating session: Error sending http request and maximum retry encountered.
Note that I am trying to set this up on Windows. Thanks a lot.
Answer (7 votes):
I ran into the same problem. You can work around it by not using the PySpark kernel (notebook) and instead using a plain Python 3 kernel (notebook). I use the following code to set up the Spark session:
import findspark
findspark.init()  # locate SPARK_HOME; must run before importing pyspark

import pyspark
from pyspark.sql import SparkSession

# May take a while locally
spark = SparkSession.builder.appName("test").getOrCreate()
spark
If you are trying to connect a Jupyter notebook to a Spark server through Livy (for example, an AWS Glue development endpoint), you must replace "localhost" with the Spark server's IP address in ~/.sparkmagic/config.json, as described here: https://aws.amazon.com/blogs/machine-learning/build-amazon-sagemaker-notebooks-backed-by-spark-in-amazon-emr/
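As a sketch of that edit, the snippet below rewrites the Livy `url` fields in a sparkmagic-style config.json. The helper `set_livy_url` is hypothetical (not part of sparkmagic), and the key names (`kernel_python_credentials`, `kernel_scala_credentials`) follow sparkmagic's example config layout; check your own file before applying this.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical helper: point the Livy endpoints in a sparkmagic config.json
# at a remote Spark server instead of localhost.
def set_livy_url(config_path, livy_url):
    config = json.loads(Path(config_path).read_text())
    for section in ("kernel_python_credentials", "kernel_scala_credentials"):
        if section in config:
            config[section]["url"] = livy_url
    Path(config_path).write_text(json.dumps(config, indent=2))
    return config

# Demo with a temporary file standing in for ~/.sparkmagic/config.json
tmp = Path(tempfile.mkdtemp()) / "config.json"
tmp.write_text(json.dumps({
    "kernel_python_credentials": {
        "username": "", "password": "",
        "url": "http://localhost:8998", "auth": "None"
    }
}))
updated = set_livy_url(tmp, "http://10.0.0.5:8998")
print(updated["kernel_python_credentials"]["url"])  # http://10.0.0.5:8998
```

After editing the real file, restart the notebook kernel so sparkmagic picks up the new endpoint. Note that the `[WinError 10061] ... actively refused it` line in the log above means nothing was listening on localhost:8998 at all, which is exactly what this change (or starting a local Livy server) addresses.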
Viewed: 25,411 times