Spark 1.6.2 (Apache), Kafka 2.1.1 (CDH 5.7.2)
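import kafka.serializer.StringDecoder
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils
val conf = new SparkConf().setAppName("Test").setMaster("local[2]")
val sc = new SparkContext(conf)
val ssc = new StreamingContext(sc, Seconds(15))
val kafkaDStream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)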
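The snippet above uses `kafkaParams` and `topics` without showing their definitions. A minimal sketch of what they might look like for the Kafka 0.8 direct stream API follows; the broker addresses and the topic name are hypothetical placeholders, not values from the original post.

```scala
// Hypothetical values: replace the broker list and topic name with your own.
val brokers = "node001:9092,node002:9092"

// The direct stream API expects a Map[String, String] of Kafka parameters
// that includes the broker list.
val kafkaParams: Map[String, String] = Map(
  "metadata.broker.list" -> brokers,  // brokers contacted for metadata and data
  "auto.offset.reset" -> "largest"    // start from the latest offsets
)

// Topics are passed as a Set[String].
val topics: Set[String] = Set("test-topic")
```

The host names in the broker list have to be resolvable and reachable from the machine running the driver, which may be worth checking when socket errors show up.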
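I am trying to use Spark Streaming to consume messages from Kafka. After the program has been running for a while, I get the following output, and from then on it keeps logging INFO SimpleConsumer: Reconnect due to socket error: java.nio.channels.ClosedChannelException: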
16/10/08 18:28:00 INFO SimpleConsumer: Reconnect due to socket error: java.nio.channels.ClosedChannelException
16/10/08 18:28:00 INFO JobScheduler: Added jobs for time 1475922480000 ms
16/10/08 18:28:15 INFO JobScheduler: Added jobs for time 1475922495000 ms
16/10/08 18:28:30 INFO JobScheduler: Added jobs for time 1475922510000 ms
16/10/08 18:28:45 INFO JobScheduler: Added jobs for time …

I am trying to connect to the Spark Thrift server through beeline. I started the Spark Thrift server as follows:
start-thriftserver.sh --master yarn-client --num-executors 2 --conf spark.driver.memory=2g --executor-memory 3g
The Spark conf/hive-site.xml is as follows:
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://node001:3306/hive?useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>NONE</value>
</property>
<property>
<name>hive.server2.thrift.client.user</name>
<value>root</value>
</property>
<property>
<name>hive.server2.thrift.client.password</name>
<value>123456</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10001</value>
</property>
<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
<description>enable or disable the hive client authorization</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
</configuration>
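When I use the beeline CLI to access the Spark Thrift server, it prompts me for a username and password. If I enter nothing and just press Enter (without typing the username and password configured in hive-site.xml), I can still access Spark. How can I make the configured credentials take effect?
Thanks :)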
beeline> !connect jdbc:hive2://node001:10001
Connecting to jdbc:hive2://node001:10001
Enter username for jdbc:hive2://node001:10001:
Enter password for …
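For reference, the same connection can be sketched over JDBC from Scala. This assumes the Hive JDBC driver (`org.apache.hive.jdbc.HiveDriver`) is on the classpath and the Thrift server is running; `hiveUrl` and `connect` are hypothetical helper names used only for illustration. With `hive.server2.authentication` set to `NONE`, as in the hive-site.xml above, the server is not expected to validate the credentials passed here, which would match the behavior seen in beeline.

```scala
import java.sql.{Connection, DriverManager}

// Build the HiveServer2 JDBC URL used in the beeline session above.
def hiveUrl(host: String, port: Int): String = s"jdbc:hive2://$host:$port"

// Open a connection with explicit credentials.
// Needs the Hive JDBC driver on the classpath and a reachable Thrift server.
def connect(host: String, port: Int, user: String, password: String): Connection =
  DriverManager.getConnection(hiveUrl(host, port), user, password)

// Host and port are taken from the beeline session in the question.
val url = hiveUrl("node001", 10001)
```

Calling `connect("node001", 10001, "", "")` and then with arbitrary credentials is one way to check whether the server actually enforces any authentication.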