Why does ./bin/spark-shell give WARN NativeCodeLoader: Unable to load native-hadoop library for your platform?

Jac*_*ski 23 hadoop apache-spark

On Mac OS X, I compiled Spark from source using the following command:

jacek:~/oss/spark
$ SPARK_HADOOP_VERSION=2.4.0 SPARK_YARN=true SPARK_HIVE=true SPARK_GANGLIA_LGPL=true xsbt
...

[info] Set current project to root (in build file:/Users/jacek/oss/spark/)
> ; clean ; assembly
...
[info] Packaging /Users/jacek/oss/spark/examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop2.4.0.jar ...
[info] Done packaging.
[info] Done packaging.
[success] Total time: 1964 s, completed May 9, 2014 5:07:45 AM

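When I started ./bin/spark-shell, I noticed the following WARN message:

WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

What could be the issue?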

jacek:~/oss/spark
$ ./bin/spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
14/05/09 21:11:17 INFO SecurityManager: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/05/09 21:11:17 INFO SecurityManager: Changing view acls to: jacek
14/05/09 21:11:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jacek)
14/05/09 21:11:17 INFO HttpServer: Starting HTTP Server
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.0.0-SNAPSHOT
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0)
Type in expressions to have them evaluated.
Type :help for more information.
...
14/05/09 21:11:49 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
...

Jac*_*ski 19

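The Supported Platforms section of the Native Libraries Guide in the Apache Hadoop documentation reads:

The native hadoop library is supported on *nix platforms only. The library does not work with Cygwin or the Mac OS X platform.

The native hadoop library is mainly used on the GNU/Linux platform and has been tested on these distributions:

  • RHEL4/Fedora
  • Ubuntu
  • Gentoo

On all the above distributions a 32/64 bit native hadoop library will work with a respective 32/64 bit jvm.

It looks like the WARN message can be ignored on Mac OS X, since the native library simply does not exist for that platform.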
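
As a rough illustration of the platform list quoted above, the following sketch classifies a machine by its kernel name. The function name `native_hadoop_status` is hypothetical (it is not part of Spark or Hadoop), and the mapping only restates the quote: Linux is the tested platform, Mac OS X (`Darwin`) and Cygwin are excluded, and anything else is left undecided.

```shell
#!/bin/sh
# native_hadoop_status is a hypothetical helper, not part of Spark or Hadoop.
# It maps a kernel name (default: the output of `uname -s`) onto the
# platform list quoted from the Hadoop guide.
native_hadoop_status() {
    os="${1:-$(uname -s)}"
    case "$os" in
        Linux)          echo "supported" ;;    # GNU/Linux: the tested platform
        Darwin|CYGWIN*) echo "unsupported" ;;  # Mac OS X and Cygwin: no native library
        *)              echo "unknown" ;;      # other *nix: not covered by the quote
    esac
}

echo "native-hadoop on this machine: $(native_hadoop_status)"
```

This only reflects the documented platform support, not what is actually installed. On a machine with Hadoop on the PATH, `hadoop checknative -a` should report which native libraries were really loaded.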

  • Is it bad to use "builtin-java classes where applicable"? Should I make the effort to install the hadoop native libraries (I am already on Linux)? (2 upvotes)

小智 6

In my experience, it works if you cd into /sparkDir/conf, rename spark-env.sh.template to spark-env.sh, and then set JAVA_OPTS and hadoop_DIR.

You also have to edit the following line in /etc/profile:

export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native/:$LD_LIBRARY_PATH
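
The steps in this answer could be scripted roughly as follows. This is a sketch under stated assumptions, not the answerer's exact procedure: `setup_spark_env` is a hypothetical helper, it copies the template instead of renaming it, and it writes the export line into spark-env.sh (which Spark's launch scripts source) rather than into /etc/profile. The answer gives no values for JAVA_OPTS or hadoop_DIR, so they are left out.

```shell
#!/bin/sh
# setup_spark_env is a hypothetical helper; it is not shipped with Spark.
# Usage: setup_spark_env /path/to/spark
setup_spark_env() {
    conf="$1/conf"
    # The answer renames the template; copying keeps the original around.
    [ -f "$conf/spark-env.sh" ] || cp "$conf/spark-env.sh.template" "$conf/spark-env.sh" || return 1
    # The export line from the answer, kept literal so that $HADOOP_HOME is
    # expanded when spark-env.sh is sourced, not when this helper runs.
    line='export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native/:$LD_LIBRARY_PATH'
    # Append only if the exact line is not there yet, so reruns are harmless.
    grep -qxF "$line" "$conf/spark-env.sh" || printf '%s\n' "$line" >> "$conf/spark-env.sh"
}
```

Calling `setup_spark_env /sparkDir` would then prepare the conf directory. HADOOP_HOME still has to be set in the environment that launches Spark for the path to resolve.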