jav*_*dba 3 apache-spark spark-structured-streaming
For a Spark Structured Streaming query, the following write:
sdf.writeStream
.outputMode(outputMode)
.format("console")
.trigger(Trigger.ProcessingTime("2 seconds"))
.start()
correctly writes its format("console") output, as shown below:
Batch: 3
+----------+------+-------+-----------------+
|OnTimeRank|Origin|Carrier|        OnTimePct|
+----------+------+-------+-----------------+
|         1|   BWI|     EV|             90.0|
|         2|   BWI|     US|88.54072251715655|
|         3|   BWI|     CO|88.52097130242826|
|         4|   BWI|     YV| 87.2168284789644|
|         5|   BWI|     DL|86.21888471700737|
|         6|   BWI|     NW|86.04866030181707|
|         7|   BWI|     9E|85.83545377438507|
|         8|   BWI|     AA|85.71428571428571|
|         9|   BWI|     FL|83.25366684127816|
|        10|   BWI|     UA|81.32427843803056|
|         1|   CMI|     MQ|81.92159607980399|
|         1|   IAH|     NW| 91.6242895602752|
|         2|   IAH|     F9|88.62350722815839|
|         3|   IAH|     US|87.54764930114358|
|         4|   IAH|     9E|84.33613445378151|
|         5|   IAH|     OO| 84.2836946277097|
|         6|   IAH|     DL|83.46420323325636|
|         7|   IAH|     UA|83.40671436433682|
|         8|   IAH|     XE|81.35189010909355|
|         9|   IAH|     OH|80.61558611656844|
+----------+------+-------+-----------------+
But this is only part of the result. Is there an equivalent of dataframe.show(numRows, truncate) that can be set via an option, something like .option("maxRows", 1000):
sdf.writeStream
.outputMode(outputMode)
.format("console")
.option("maxRows",1000) // This is what I want but not sure how to do
.trigger(Trigger.ProcessingTime("2 seconds"))
.start()
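The option is called numRows, e.g. .option("numRows", 1000). The console sink prints 20 rows per trigger by default, which is why the batch above stops after 20 rows. There is also a truncate option (default true) that controls whether long cell values are cut off.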
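For reference, here is a minimal self-contained sketch of the console sink with both options set. The question does not show how `sdf` is built, so the built-in `rate` source stands in for it here; the object name, app name, `local[2]` master, and "append" output mode are likewise assumptions for illustration only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

object ConsoleSinkNumRows {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("console-sink-numRows") // hypothetical app name
      .master("local[2]")              // assumption: local test run
      .getOrCreate()

    // Assumption: stand-in for the question's `sdf`. The rate source
    // generates rows with `timestamp` and `value` columns.
    val sdf = spark.readStream
      .format("rate")
      .option("rowsPerSecond", 100)
      .load()

    val query = sdf.writeStream
      .outputMode("append")            // assumption: the question's outputMode is not shown
      .format("console")
      .option("numRows", 1000)         // print up to 1000 rows per batch instead of 20
      .option("truncate", false)       // do not cut off long cell values
      .trigger(Trigger.ProcessingTime("2 seconds"))
      .start()

    // Block until the query is stopped or fails.
    query.awaitTermination()
  }
}
```

With the question's own `sdf` and `outputMode`, only the two `.option(...)` lines need to be added to the existing writer chain.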