I am trying to read a Delta log file on a Databricks Community Edition cluster (Databricks 7.2).
df=spark.range(100).toDF("id")
df.show()
df.repartition(1).write.mode("append").format("delta").save("/user/delta_test")
with open('/user/delta_test/_delta_log/00000000000000000000.json','r') as f:
    for l in f:
        print(l)
This fails with a file-not-found error:
FileNotFoundError: [Errno 2] No such file or directory: '/user/delta_test/_delta_log/00000000000000000000.json'
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<command-1759925981994211> in <module>
----> 1 with open('/user/delta_test/_delta_log/00000000000000000000.json','r') as f:
      2     for l in f:
      3         print(l)
FileNotFoundError: [Errno 2] No such file or directory: '/user/delta_test/_delta_log/00000000000000000000.json'
I tried prefixing the path with /dbfs/ and with dbfs:/, but neither worked; I still get the same error.
with open('/dbfs/user/delta_test/_delta_log/00000000000000000000.json','r') as f:
    for l in f:
        print(l)
However, I am able to read the file using dbutils.fs.head. …
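Since dbutils.fs.head does return the file contents as a string, one possible workaround is to parse that string directly instead of going through open(). The following is only a minimal sketch: parse_delta_log is a hypothetical helper name, the sample string is made-up illustrative data (not real output from this table), and it assumes the log file holds one JSON object per line.

```python
import json


def parse_delta_log(text):
    """Parse newline-delimited JSON (one object per line) into a list of dicts."""
    actions = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            actions.append(json.loads(line))
    return actions


# Illustrative stand-in for the string returned by dbutils.fs.head(...);
# this is NOT real output from the table in the question.
sample = (
    '{"commitInfo": {"operation": "WRITE"}}\n'
    '{"protocol": {"minReaderVersion": 1, "minWriterVersion": 2}}\n'
)

actions = parse_delta_log(sample)
for action in actions:
    # Print the top-level key of each parsed entry.
    print(list(action.keys()))

# In a Databricks notebook (assumption: dbutils is available there), the input
# could instead come from:
#   text = dbutils.fs.head("dbfs:/user/delta_test/_delta_log/00000000000000000000.json")
#   actions = parse_delta_log(text)
```

Note that dbutils.fs.head only returns the first bytes of a file (up to a maximum byte count), so for a large log file the last line may be cut off and fail to parse.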
apache-spark pyspark databricks dbutils databricks-community-edition