Python Azure Databricks: 'DataFrame' object does not support item assignment

use*_*224 4 python azure python-3.x azure-databricks

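I am working with Azure Databricks. I run a Python script in a notebook and read data from SQL. I am trying to split a datetime column into separate date and time columns. This is the Python code: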

    pushdown_query = "(SELECT * FROM STAGE.OutagesAndInterruptions) int_alias"
    df = spark.read.jdbc(url=jdbcUrl, table=pushdown_query, properties=connectionProperties)

    df['INTERRUPTION_DATE']=df['INTERRUPTION_TIME'].dt.date

df['INTERRUPTION_TIME'] looks like:

+-------------------+
|  INTERRUPTION_TIME|
+-------------------+
|1997-05-12 09:57:00|
|1998-03-08 13:00:00|
|1998-02-26 13:00:00|
|1998-02-26 13:00:00|
|1998-03-03 10:04:00|
|1998-05-20 09:27:00|
|1998-11-21 08:51:00|
|1998-11-27 08:44:00|
|1998-10-19 01:19:00|
|1998-10-19 01:44:00|
|2000-03-13 07:00:00|
|2000-03-19 07:30:00|
|2000-08-04 12:55:00|
|2002-09-30 18:11:00|
|2002-09-30 18:11:00|
|2002-05-06 09:22:00|
|2002-01-16 13:15:00|
|2003-01-08 15:46:00|
|2003-02-04 10:25:00|
|2003-02-04 10:25:00|
+-------------------+

When I run the code, it throws an error message:

TypeError: 'DataFrame' object does not support item assignment
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<command-2244924718685919> in <module>
----> 1 df['INTERRUPTION_DATE']=df['INTERRUPTION_TIME'].dt.date

TypeError: 'DataFrame' object does not support item assignment

Can we create a new column on a DataFrame? How do we create a new column on a DataFrame in Azure Databricks?

Mar*_*ari 5

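This should work. A Spark DataFrame cannot be modified in place, so instead of assigning to df['...'] you call withColumn, which returns a new DataFrame: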

from pyspark.sql.types import DateType

# withColumn returns a new DataFrame; casting the timestamp to DateType keeps only the date part
df2 = df.withColumn('INTERRUPTION_DATE', df['INTERRUPTION_TIME'].cast(DateType()))
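
For context on why the assignment in the question fails: Python raises this TypeError for any object whose class does not define `__setitem__`, and a Spark DataFrame is such an object, so new columns have to come from methods that return a new DataFrame. The toy class below is only a sketch of that behaviour in plain Python, not PySpark code; `ImmutableFrame`, `with_column` and `column_names` are hypothetical names made up for illustration.

```python
class ImmutableFrame:
    """Toy stand-in for a Spark DataFrame: readable by key, never modified in place."""

    def __init__(self, columns):
        # Keep a private copy so callers cannot mutate our data from outside.
        self._columns = {name: list(values) for name, values in columns.items()}

    def __getitem__(self, name):
        # Reading frame['col'] works because __getitem__ is defined.
        return list(self._columns[name])

    # No __setitem__ is defined, so frame['col'] = ... raises TypeError.

    def with_column(self, name, values):
        # Build and return a new frame; the original is left untouched.
        new_columns = dict(self._columns)
        new_columns[name] = list(values)
        return ImmutableFrame(new_columns)

    @property
    def column_names(self):
        return list(self._columns)


df = ImmutableFrame({"INTERRUPTION_TIME": ["1997-05-12 09:57:00", "1998-03-08 13:00:00"]})

# Same shape of mistake as in the question: item assignment on the frame.
try:
    df["INTERRUPTION_DATE"] = ["1997-05-12", "1998-03-08"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# The working pattern: derive the new column and keep the returned frame.
dates = [timestamp.split(" ")[0] for timestamp in df["INTERRUPTION_TIME"]]
df2 = df.with_column("INTERRUPTION_DATE", dates)

print(df.column_names)   # original frame is unchanged
print(df2.column_names)  # new frame carries the extra column
```

The same idea applies to the real API: keep the value returned by withColumn (df2 above), because df itself never gains the column.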

Edit after comments:

from pyspark.sql.functions import date_format

df.select(date_format('INTERRUPTION_TIME', 'M/d/yyyy').alias('INTERRUPTION_DATE'),
          date_format('INTERRUPTION_TIME', 'h:m:s a').alias('TIME'))