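I am a beginner in deep learning, and I came across the concept of "gradient checking".
I would just like to know: what is it, and how does it help improve the training process?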
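To make the question concrete: gradient checking is usually described as comparing a hand-derived (analytic) gradient against a finite-difference estimate of the same gradient, so that bugs in the derivative code show up as a large mismatch. Below is a minimal sketch in plain Python, assuming a toy two-parameter loss. The names `loss`, `analytic_grad`, `buggy_grad`, `numerical_gradient`, `relative_error`, and the step size `eps` are all made up for this illustration and do not come from any particular framework.

```python
import math


def numerical_gradient(f, params, eps=1e-5):
    """Central-difference estimate of the partial derivative of f per parameter."""
    grad = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad


def relative_error(a, b):
    """Norm of the difference divided by the sum of the norms."""
    num = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    den = math.sqrt(sum(x * x for x in a)) + math.sqrt(sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def loss(w):
    # Hypothetical toy loss: squared residual of a linear model.
    return (2.0 * w[0] + 3.0 * w[1] - 1.0) ** 2


def analytic_grad(w):
    # Correct gradient of the toy loss, derived by hand.
    r = 2.0 * w[0] + 3.0 * w[1] - 1.0
    return [4.0 * r, 6.0 * r]


def buggy_grad(w):
    # Deliberately wrong second component, to show what a failed check looks like.
    r = 2.0 * w[0] + 3.0 * w[1] - 1.0
    return [4.0 * r, 3.0 * r]


w = [0.5, -0.25]
numeric = numerical_gradient(loss, w)
err_ok = relative_error(analytic_grad(w), numeric)
err_bad = relative_error(buggy_grad(w), numeric)
print("correct gradient, relative error:", err_ok)
print("buggy gradient, relative error:", err_bad)
```

The check does not speed up or improve training by itself. It is a debugging step: a very small relative error suggests the analytic gradient matches the loss, while a large one points to a mistake in the derivative code that would otherwise silently degrade training.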
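I am experimenting with Pandas UDFs and am running into an IllegalArgumentException. I also tried copying the example from the PySpark GroupedData documentation to check, but I still get the error.
Here is the environment configuration: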
from pyspark.sql.functions import pandas_udf, PandasUDFType
@pandas_udf('int', PandasUDFType.GROUPED_AGG)
def min_udf(v):
    return v.min()
sorted(gdf.agg(min_udf(df.age)).collect())
Output:
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-66-94a0a39bfe30> in <module>
----> 1 sorted(gdf.agg(min_udf(sample_data.sqft)).collect())

~/Desktop/test/venv/lib/python3.7/site-packages/pyspark/sql/dataframe.py in collect(self)
    532         """
    533         with SCCallSiteSync(self._sc) as css:
--> 534             sock_info = self._jdf.collectToPython()
    535         return list(_load_from_socket(sock_info, BatchedSerializer(PickleSerializer())))
    536

~/Desktop/test/venv/lib/python3.7/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1255         answer = self.gateway_client.send_command(command)
   1256         return_value = get_return_value(
-> 1257             answer, self.gateway_client, …