Related solutions (0)

Spark metrics on the wordcount example

I read the Metrics section on the Spark website. I wanted to try it on the wordcount example, but I can't get it to work.

spark/conf/metrics.properties:

# Enable CsvSink for all instances
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink

# Polling period for CsvSink
*.sink.csv.period=1

*.sink.csv.unit=seconds

# Polling directory for CsvSink
*.sink.csv.directory=/home/spark/Documents/test/

# Worker instance overlap polling period
worker.sink.csv.period=1

worker.sink.csv.unit=seconds

# Enable jvm source for instance master, worker, driver and executor
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource

worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource

driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource

executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
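
The keys above follow the pattern `[instance].[sink|source].[name].[option]`, where `*` supplies defaults for every instance and a named instance (for example `worker`) overrides them. As a rough illustration only, the Python sketch below resolves the effective settings for one instance from a shortened copy of this file. `parse_properties` and `effective_settings` are hypothetical helpers written for this example; they are not part of Spark, and Spark's own loader may differ in details.

```python
# Shortened copy of the properties from the question (illustration only).
SAMPLE = """\
# Enable CsvSink for all instances
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
*.sink.csv.period=1
*.sink.csv.unit=seconds
*.sink.csv.directory=/home/spark/Documents/test/
worker.sink.csv.period=1
worker.sink.csv.unit=seconds
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def effective_settings(props, instance):
    """Merge '*' defaults with instance-specific keys; the instance wins."""
    merged = {}
    for prefix in ("*", instance):  # defaults first, then overrides
        for key, value in props.items():
            head, _, rest = key.partition(".")
            if head == prefix:
                merged[rest] = value
    return merged


if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    for instance in ("driver", "worker"):
        print(instance, effective_settings(props, instance))
```

Running the helper for `driver` and `worker` shows which sinks and sources each instance would end up with, which can help spot a key that never applies to the instance actually running (with `--master local[4]` there is no separate master or worker process).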

I run my application locally, as shown in the documentation:

$SPARK_HOME/bin/spark-submit   --class "SimpleApp"   --master local[4]   target/scala-2.10/simple-project_2.10-1.0.jar

I checked /home/spark/Documents/test/ and it is empty.

What am I missing?


Shell:

$SPARK_HOME/bin/spark-submit   --class "SimpleApp"   --master local[4]  --conf   spark.metrics.conf=/home/spark/development/spark/conf/metrics.properties  target/scala-2.10/simple-project_2.10-1.0.jar
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Using …

metrics apache-spark

Votes: 6, Answers: 1, Views: 4888

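Spark Metrics: how to access executor and worker data?

Note: I am using Spark on YARN.

I have been experimenting with the metrics system implemented in Spark. I enabled ConsoleSink and CsvSink, and enabled JvmSource for all four instances (driver, master, executor, worker). However, I only get driver output; there is no worker, executor, or master data in the console or in the CSV target directory.

After reading this question, I wonder whether I have to send something to the executors when submitting the job.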

My submit command: ./bin/spark-submit --class org.apache.spark.examples.SparkPi lib/spark-examples-1.5.0-hadoop2.6.0.jar 10
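
To make "sending something to the executors" concrete: spark-submit has a `--files` option that copies files into each executor's working directory, and a `spark.metrics.conf` property that names the metrics configuration file. The Python sketch below only composes such a command line; it does not run Spark. The path `conf/metrics.properties` and the helper `build_submit_command` are assumptions made for this example, and nothing on this page confirms that this resolves the problem described.

```python
import shlex


def build_submit_command(metrics_file, app_class, app_jar, app_args=()):
    """Compose a spark-submit argv that also ships the metrics file.

    --files asks spark-submit to place a copy of the file in each
    executor's working directory, so spark.metrics.conf can refer to
    it by its bare file name.
    """
    file_name = metrics_file.rsplit("/", 1)[-1]
    return [
        "./bin/spark-submit",
        "--class", app_class,
        "--files", metrics_file,
        "--conf", f"spark.metrics.conf={file_name}",
        app_jar,
        *app_args,
    ]


cmd = build_submit_command(
    "conf/metrics.properties",  # assumed path, not given in the question
    "org.apache.spark.examples.SparkPi",
    "lib/spark-examples-1.5.0-hadoop2.6.0.jar",
    ["10"],
)
print(shlex.join(cmd))
```

The printed line can be compared with the original submit command above; the only differences are the added `--files` and `--conf` arguments.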

Below is my metric.properties file:

# Enable JmxSink for all instances by class name
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink

# Enable ConsoleSink for all instances by class name
*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink

# Polling period for ConsoleSink
*.sink.console.period=10

*.sink.console.unit=seconds

#######################################
# worker instance overlap polling period
worker.sink.console.period=5

worker.sink.console.unit=seconds
#######################################

# Master instance overlap polling period
master.sink.console.period=15

master.sink.console.unit=seconds

# Enable CsvSink for all instances
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
#driver.sink.csv.class=org.apache.spark.metrics.sink.CsvSink …

monitoring metrics hadoop-yarn apache-spark

Votes: 6, Answers: 1, Views: 5861
