Nav*_*nth 6 python datetime pyspark
I am using Spark 2.1.0. I am unable to create a timestamp column in PySpark using the code snippet below. Please help.
df=df.withColumn('Age',lit(datetime.now()))
I am getting:
AssertionError: col should be Column
Ank*_*ngh 11
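Assuming you already have a DataFrame, as in your code snippet, and you want every row to carry the same timestamp.
Let me create a dummy DataFrame first.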
>>> rows = [{'name': 'Alice', 'age': 1},{'name': 'Again', 'age': 2}]
>>> df = spark.createDataFrame(rows)
>>> import time
>>> import datetime
>>> timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y-%m-%d %H:%M:%S')
>>> type(timestamp)
<class 'str'>
>>> from pyspark.sql.functions import lit,unix_timestamp
>>> timestamp
'2017-08-02 16:16:14'
>>> new_df = df.withColumn('time',unix_timestamp(lit(timestamp),'yyyy-MM-dd HH:mm:ss').cast("timestamp"))
>>> new_df.show(truncate = False)
+---+-----+---------------------+
|age|name |time                 |
+---+-----+---------------------+
|1  |Alice|2017-08-02 16:16:14.0|
|2  |Again|2017-08-02 16:16:14.0|
+---+-----+---------------------+
>>> new_df.printSchema()
root
|-- age: long (nullable = true)
|-- name: string (nullable = true)
|-- time: timestamp (nullable = true)
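The string-building step above can be written more compactly, since `datetime.datetime.now()` gives the same current local time as `datetime.datetime.fromtimestamp(time.time())`. Below is a minimal sketch in plain Python (no Spark required) that produces the string handed to `lit(...)` and checks that it parses back cleanly. The names `format_now`, `PY_FORMAT` and `SPARK_FORMAT` are hypothetical, introduced only for illustration.

```python
from datetime import datetime

# Python strftime pattern and the Java-style pattern that describes the
# same layout. SPARK_FORMAT is the string passed to unix_timestamp(...)
# in the session above; it is only recorded here, not used by Python.
PY_FORMAT = "%Y-%m-%d %H:%M:%S"
SPARK_FORMAT = "yyyy-MM-dd HH:mm:ss"


def format_now(now=None):
    """Hypothetical helper: format `now` (default: current local time)
    as a second-precision string such as '2017-08-02 16:16:14'."""
    if now is None:
        now = datetime.now()
    return now.strftime(PY_FORMAT)


timestamp = format_now()

# Round trip: the string parses back with the same pattern. Microseconds
# are dropped by strftime, so the value has one-second precision.
parsed = datetime.strptime(timestamp, PY_FORMAT)
print(timestamp, parsed.microsecond)
```

Because the string is built once on the driver, every row receives exactly the same value, which is the behaviour this answer assumes you want.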
bal*_*ika 10
I am not sure about 2.1.0, but on 2.2.1 at least you can simply do:
from pyspark.sql import functions as F
df.withColumn('Age', F.current_timestamp())
Hope it helps!