Apache Spark Dataset API: head(n: Int) vs take(n: Int)

Question

Kri*_*ddy 6 apache-spark apache-spark-sql spark-dataframe

The Apache Spark Dataset API has two methods, head(n: Int) and take(n: Int).

The Dataset.scala source contains:

def take(n: Int): Array[T] = head(n)

I can't find any difference in the code these two functions execute. Why does the API have two different methods that produce the same result?
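
The equivalence described above can be checked directly. The following is a minimal sketch, assuming Spark SQL is on the classpath and that a local single-threaded session is acceptable; the object name `HeadVsTake`, the helper `compare`, and the sample values are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.SparkSession

object HeadVsTake {
  // Returns the results of take(3) and head(3) on the same small Dataset.
  def compare(): (Array[Int], Array[Int]) = {
    // Assumption: a local session is enough for this check.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("head-vs-take")
      .getOrCreate()
    import spark.implicits._

    try {
      // Hypothetical sample data, not taken from the question.
      val ds = Seq(10, 20, 30, 40, 50).toDS()
      (ds.take(3), ds.head(3))
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (taken, headed) = compare()
    // Compare element by element, since Array equality in Scala is by reference.
    println(taken.sameElements(headed))
  }
}
```

Because take(n) simply delegates to head(n) in the source quoted above, both arrays should hold the same first three rows.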

Answer 1

Lui*_*uis 3

In my opinion, the reason is that the Apache Spark Dataset API tries to mimic the Pandas DataFrame API, which includes head: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.head.html
