Tags: hadoop, hadoop-yarn, apache-spark, pyspark
I am trying to run a bash script that launches spark-submit to run a pyspark script, but without success. I want to check the YARN logs using "yarn logs -applicationId". My question is: how do I find the appropriate application ID?
1. Using Yarn Logs:
In the logs you can see the tracking URL: http://<nn>:8088/proxy/application_*****/
If you copy and open that link, you can see all the application's logs in the Resource Manager.
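For the bash script case in the question, the application ID can be grepped straight out of the captured spark-submit output, since the client prints the tracking URL line shown above. A minimal sketch; my_job.py and submit.log are hypothetical names:

#!/usr/bin/env bash
# Run the job and keep a copy of the console output.
spark-submit --master yarn my_job.py 2>&1 | tee submit.log

# Pull the application ID out of the "tracking URL" line.
APP_ID=$(grep -oE 'application_[0-9]+_[0-9]+' submit.log | head -n 1)

# Fetch the aggregated logs for that application.
yarn logs -applicationId "$APP_ID"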
2. Using the Spark application:
From the sparkContext we can get the applicationId:
print(spark.sparkContext.applicationId)
3. Using the yarn application command:
Use the yarn application -list command to get all the running YARN applications on the cluster, then use:
yarn application -help
-appStates <States> Works with -list to filter applications
based on input comma-separated list of
application states. The valid application
state can be one of the following:
ALL,NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUN
NING,FINISHED,FAILED,KILLED
-appTypes <Types> Works with -list to filter applications
based on input comma-separated list of
application types.
-help Displays help for all commands.
-kill <Application ID> Kills the application.
-list List applications. Supports optional use
of -appTypes to filter applications based
on application type, and -appStates to
filter applications based on application
state.
-movetoqueue <Application ID> Moves the application to a different
queue.
-queue <Queue Name> Works with the movetoqueue command to
specify which queue to move an
application to.
-status <Application ID> Prints the status of the application.
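For example, once you have an ID, -status is a quick way to check whether the application has finished before pulling its logs (the ID below is made up):

yarn application -status application_1234567890123_0001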
List all the finished applications:
yarn application -appStates FINISHED -list
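Putting it together for the original question: the first column of the yarn application -list output is the application ID, so a bash script can filter by application name and feed the result to yarn logs. A minimal sketch; the name my_app is an assumption (match it to the --name you pass to spark-submit):

# Hypothetical application name; adjust the grep pattern to your job.
APP_ID=$(yarn application -list -appStates FINISHED 2>/dev/null \
  | grep my_app \
  | awk '{print $1}' \
  | head -n 1)

yarn logs -applicationId "$APP_ID"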