Asked by The*_*mis (5 votes) · Tags: databricks, azure-databricks, delta-lake
I need to read a dataset into a DataFrame and then write the data out to Delta Lake, but I am getting the following exception:
AnalysisException: Incompatible format detected.

You are trying to write to `dbfs:/user/class@azuredatabrickstraining.onmicrosoft.com/delta/customer-data/` using Databricks Delta, but there is no
transaction log present. Check the upstream job to make sure that it is writing
using format("delta") and that you are trying to write to the table base path.

To disable this check, SET spark.databricks.delta.formatCheck.enabled=false
To learn more about Delta, see https://docs.azuredatabricks.net/delta/index.html
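As the message itself suggests, the check can be disabled, although that only hides the symptom rather than fixing the cause; a minimal sketch, assuming a Databricks notebook where spark is the active session:

# Suppress Delta's incompatible-format check (use with care)
spark.conf.set("spark.databricks.delta.formatCheck.enabled", "false")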
Here is the code that precedes the exception:
from pyspark.sql.types import StructType, StructField, DoubleType, IntegerType, StringType

# Explicit schema for the invoice CSV data
inputSchema = StructType([
    StructField("InvoiceNo", IntegerType(), True),
    StructField("StockCode", StringType(), True),
    StructField("Description", StringType(), True),
    StructField("Quantity", IntegerType(), True),
    StructField("InvoiceDate", StringType(), True),
    StructField("UnitPrice", DoubleType(), True),
    StructField("CustomerID", IntegerType(), True),
    StructField("Country", StringType(), True)
])

# Read the CSV file into a DataFrame using the schema above
rawDataDF = (spark.read
    .option("header", "true")
    .schema(inputSchema)
    .csv(inputPath)
)

# Write to Delta Lake, partitioned by Country
rawDataDF.write.mode("overwrite").format("delta").partitionBy("Country").save(DataPath)
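When this exception appears, it helps to inspect what already lives at the target path, since a real Delta table keeps a _delta_log directory at its base path. A minimal check, assuming Databricks' built-in dbutils and the same DataPath variable:

# List the target directory; a Delta table would contain a _delta_log folder
display(dbutils.fs.ls(DataPath))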
Answer by Mic*_*ust (10 votes):
This error message is telling you that there is already data at the target path (in this case dbfs:/user/class@azuredatabrickstraining.onmicrosoft.com/delta/customer-data/) and that the data is not in Delta format (i.e., there is no transaction log). You can either choose a new path (which, judging by the comments above, you have already done) or delete that directory and try again.
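If you take the second option, the conflicting directory can be removed recursively before rerunning the write; a minimal sketch, assuming Databricks' built-in dbutils and the path from the error message:

# Recursively delete the non-Delta data at the target path, then rerun the write
dbutils.fs.rm("dbfs:/user/class@azuredatabrickstraining.onmicrosoft.com/delta/customer-data/", True)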
Views: 6938