Spark-shell connecting to Mesos hangs at sched.cpp

lyo*_*omi 6 mesos apache-spark

Below are my spark-defaults.conf and the output of spark-shell:

$ cat conf/spark-defaults.conf
spark.master                     mesos://172.16.**.***:5050
spark.eventLog.enabled           false
spark.broadcast.compress         false
spark.driver.memory              4g
spark.executor.memory            4g
spark.executor.instances         1

$ bin/spark-shell
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80)
Type in expressions to have them evaluated.
Type :help for more information.
15/11/15 04:56:11 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
I1115 04:56:12.171797 72994816 sched.cpp:164] Version: 0.25.0
I1115 04:56:12.173741 67641344 sched.cpp:262] New master detected at master@172.16.**.***:5050
I1115 04:56:12.173951 67641344 sched.cpp:272] No credentials provided. Attempting to register without authentication

It hangs here indefinitely, while the Mesos Web UI shows many Spark frameworks spinning up, continuously registering and unregistering, until I quit spark-shell with Ctrl-C.

[Screenshot: Mesos Web UI]

I suspect this is partly caused by my laptop having multiple IP addresses. When run on a server, it continues past this point and reaches the usual Scala REPL:

I1116 09:53:30.265967 29327 sched.cpp:641] Framework registered with 9d725348-931a-48fb-96f7-d29a4b09f3e8-0242
15/11/16 09:53:30 INFO mesos.MesosSchedulerBackend: Registered as framework ID 9d725348-931a-48fb-96f7-d29a4b09f3e8-0242
15/11/16 09:53:30 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57810.
15/11/16 09:53:30 INFO netty.NettyBlockTransferService: Server created on 57810
15/11/16 09:53:30 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/11/16 09:53:30 INFO storage.BlockManagerMasterEndpoint: Registering block manager 172.16.**.***:57810 with 2.1 GB RAM, BlockManagerId(driver, 172.16.**.***, 57810)
15/11/16 09:53:30 INFO storage.BlockManagerMaster: Registered BlockManager
15/11/16 09:53:30 INFO repl.Main: Created spark context..
Spark context available as sc.

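I am running Mesos 0.25.0 built by Mesosphere, and I am setting spark.driver.host to an address that is reachable from all machines in the Mesos cluster. I can see that every port opened by the spark-shell process is bound either to that IP address or to *.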

The most similar question on StackOverflow does not seem to help, because in my case the laptop should be reachable from the cluster hosts.

I cannot find any log file that might explain why the frameworks are being unregistered. Where should I look to resolve this?

Ste*_*ker 5

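Mesos has a rather unusual notion of how networking works. In particular, it is essential that bidirectional communication can be established between the Master and the Framework, so each side needs a network route to the other. If you have run behind NAT or inside containers, you will have hit this before: usually you need to set LIBPROCESS_IP on the Framework side to an IP address that the Master can reach. Perhaps the same applies to multi-homed systems such as your laptop.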
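As a rough sketch of that suggestion, assuming the laptop has one address the Mesos master can route back to, the variable can be exported before starting the shell. The address 172.16.0.10 and the DRIVER_IP variable are hypothetical placeholders, not values from the question, and this is untested against the asker's setup.

```shell
# Sketch only: 172.16.0.10 is a hypothetical placeholder, not an address from the question.
# Replace it (or set DRIVER_IP beforehand) with the laptop IP the Mesos master can route back to.
DRIVER_IP="${DRIVER_IP:-172.16.0.10}"

# libprocess, the networking layer used by the Mesos scheduler driver, reads this
# variable to decide which address to bind to and advertise to the master.
export LIBPROCESS_IP="$DRIVER_IP"

# Keep Spark's own driver address consistent with it, then start the shell.
# The guard only avoids an error when this is not run from the Spark home directory.
if [ -x bin/spark-shell ]; then
  bin/spark-shell --conf spark.driver.host="$LIBPROCESS_IP"
fi
```

If the framework then registers, the sched.cpp output should continue to a "Framework registered with ..." line like the one in the server log from the question.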

You can find a bit more information on the internet, but sadly it is not well documented. There is a hint on their Deployment Scripts page, though.