Exception in thread "main" java.io.IOException: Cannot run program "": error=2, No such file or directory

Yu-*_*LIN 1 java python-3.x apache-spark

I ran into this problem when executing the following script:

./spark-submit /home/*****/public_html/****/****.py

I first used Python 3.7.2 and later Python 3.5.2, but I still get the following error message.

Exception in thread "main" java.io.IOException: Cannot run program "": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
    at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:100)
    at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
    at java.lang.ProcessImpl.start(ProcessImpl.java:134)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
    ... 12 more

Before that, several warnings were printed:

2019-02-07 11:30:18 WARN  Utils:66 - Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using xxx.xxx.xxx.xxx instead (on interface eth0)
2019-02-07 11:30:18 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2019-02-07 11:30:19 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

I am able to run python3 -V, and I am able to start both spark-shell and pyspark.

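I also find it strange that nothing is shown between the quotes "" in the error message, where the program name should appear.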

My Python code starts with:

import sys
import urllib3
import requests

from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.types import StructType, StructField
from pyspark.sql.types import DoubleType, IntegerType, StringType

from CommonFunctions import *
from LanguageCodeParser import *

I also tried a very simple Python script:

print("This is a test.")

Here are some of the messages printed after running bash -x spark-submit test.py:

+ '[' -z /opt/spark-2.3.2-bin-hadoop2.7 ']'
+ export PYTHONHASHSEED=0
+ PYTHONHASHSEED=0
+ exec /opt/spark-2.3.2-bin-hadoop2.7/bin/spark-class org.apache.spark.deploy.SparkSubmit test.py

However, it still does not work. Thanks in advance for your help.

Yu-*_*LIN 9

I found that setting PYSPARK_PYTHON=/usr/bin/python3 solves the problem.

It is best to set this environment variable in

/opt/spark-2.3.2-bin-hadoop2.7/conf/spark-env.sh
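
For anyone who wants to check their environment before editing spark-env.sh, here is a minimal diagnostic sketch in Python. It is not Spark's own code. It assumes the driver picks its interpreter from PYSPARK_DRIVER_PYTHON first, then PYSPARK_PYTHON, and falls back to "python" otherwise; that order, the function name check_interpreter, and the status strings are my own assumptions for illustration, and the sketch ignores any spark.pyspark.* configuration properties. The point it illustrates is that a variable set to an empty string still counts as set, which would be consistent with the empty program name in Cannot run program "".

```python
import os
import shutil
import sys

# Assumed lookup order (not taken from the question): the driver-specific
# variable first, then the general one.
LOOKUP_ORDER = ("PYSPARK_DRIVER_PYTHON", "PYSPARK_PYTHON")


def check_interpreter(env):
    """Return (value, status) for the Python executable that would be launched.

    status is "empty", "missing", or "ok" (hypothetical labels).
    """
    value = "python"  # assumed fallback when neither variable is set
    for name in LOOKUP_ORDER:
        if name in env:
            # An empty string still counts as "set" here, so it is used as-is.
            value = env[name]
            break
    if value == "":
        return value, "empty"
    # shutil.which handles both bare names (searched on PATH) and full paths.
    if shutil.which(value, path=env.get("PATH", os.defpath)) is None:
        return value, "missing"
    return value, "ok"


if __name__ == "__main__":
    value, status = check_interpreter(os.environ)
    print(f"interpreter={value!r} status={status}")
    if status != "ok":
        print("try: export PYSPARK_PYTHON=/usr/bin/python3", file=sys.stderr)
```

If the script reports status "empty" or "missing", adding a line such as export PYSPARK_PYTHON=/usr/bin/python3 to conf/spark-env.sh, as described above, gives the driver a real interpreter path to run.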