"No such file or directory" error when running pyspark

mca*_*ado 3 apache-spark

I installed Spark, but when I run pyspark from the terminal I get:

/usr/local/Cellar/apache-spark/2.4.5_1/libexec/bin/pyspark: line 24: /Users/miguel/spark-2.3.0-bin-hadoop2.7/bin/load-spark-env.sh: No such file or directory
/usr/local/Cellar/apache-spark/2.4.5_1/libexec/bin/pyspark: line 77: /Users/miguel/spark-2.3.0-bin-hadoop2.7/bin/spark-submit: No such file or directory
/usr/local/Cellar/apache-spark/2.4.5_1/libexec/bin/pyspark: line 77: exec: /Users/miguel/spark-2.3.0-bin-hadoop2.7/bin/spark-submit: cannot execute: No such file or directory

I have tried uninstalling and reinstalling everything (Spark, Java, Scala), but it keeps throwing this error. I have also searched here and through GitHub issues, but couldn't find anything helpful.

Additional information:

brew doctor

(myenv) C02YH1U3FSERT:~ miguel$ brew doctor
Please note that these warnings are just used to help the Homebrew maintainers
with debugging if you file an issue. If everything you use Homebrew for is
working fine: please don't worry or file an issue; just ignore this. Thanks!

Warning: "config" scripts exist outside your system or Homebrew directories.
`./configure` scripts often look for *-config scripts to determine if
software packages are installed, and which additional flags to use when
compiling and linking.

Having additional scripts in your path can confuse software installed via
Homebrew if the config script overrides a system or Homebrew-provided
script of the same name. We found the following "config" scripts:
  /Users/miguel/.pyenv/shims/python3.7-config
  /Users/miguel/.pyenv/shims/python3.7m-config
  /Users/miguel/.pyenv/shims/python-config
  /Users/miguel/.pyenv/shims/python3-config 

brew tap

(myenv) C02YH1U3FSERT:~ miguel$ brew tap
adoptopenjdk/openjdk
homebrew/cask
homebrew/cask-versions
homebrew/core

hadoop version

Hadoop 3.2.1
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r b3cbbb467e22ea829b3808f4b7b01d07e0bf3842
Compiled by rohithsharmaks on 2019-09-10T15:56Z
Compiled with protoc 2.5.0
From source with checksum 776eaf9eee9c0ffc370bcbc1888737
This command was run using /usr/local/Cellar/hadoop/3.2.1_1/libexec/share/hadoop/common/hadoop-common-3.2.1.jar

echo $SPARK_HOME

/Users/miguel/spark-2.3.0-bin-hadoop2.7

hdfs dfs -ls

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 6 items
...

I have spent a lot of time on this, so it would be great if someone could point me toward a solution.

mar*_*ita 6

The cause was that SPARK_HOME was still set to the old path. The pyspark launcher script resolves load-spark-env.sh and spark-submit relative to $SPARK_HOME whenever that variable is set, which is why the Homebrew copy kept looking under the deleted spark-2.3.0-bin-hadoop2.7 directory. Even after running source ~/.bash_profile the value was not cleared, until I ran:

unset SPARK_HOME
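To confirm the stale value is really gone from the current shell, a quick check along these lines should print an empty line for the variable and resolve pyspark to the Homebrew wrapper (the exact symlink location is an assumption based on a default Homebrew setup):

echo $SPARK_HOME
# should print nothing now
which pyspark
# typically /usr/local/bin/pyspark, linked into the Cellar install shown in the error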

After that, the error was gone.
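To keep the stale value from coming back in new shells, it is also worth checking ~/.bash_profile (or ~/.zshrc, depending on your shell) for the old export and either deleting it or pointing it at the Homebrew-managed install instead. A minimal sketch, assuming the apache-spark formula from the question is installed via Homebrew:

# find any leftover export of the old manual download
grep -n 'SPARK_HOME' ~/.bash_profile

# either delete that line, or change it to point at the Homebrew-managed install, e.g.:
export SPARK_HOME="$(brew --prefix apache-spark)/libexec"

# reload the profile and verify
source ~/.bash_profile
echo $SPARK_HOME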