Related questions and solutions (0)

Dynamic (column-based) interval

How do I add a dynamic (column-based) number of days to NOW?

SELECT NOW() + INTERVAL a.number_of_days "DAYS" AS "The Future Date" 
FROM a;

Here, a.number_of_days is an integer.
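One commonly used approach in PostgreSQL is to multiply a fixed one-day interval by the integer column, for example `NOW() + a.number_of_days * INTERVAL '1 day'`. The sketch below illustrates the same arithmetic in plain Python with the standard library only. It is an illustration of the idea, not a PostgreSQL client: `future_dates`, `ONE_DAY`, and the sample values are hypothetical names and data introduced here.

```python
from datetime import datetime, timedelta

# A fixed unit interval, playing the role of INTERVAL '1 day' in the SQL form.
ONE_DAY = timedelta(days=1)

def future_dates(now, day_counts):
    """Scale the one-day unit by each integer and add it to `now`."""
    return [now + n * ONE_DAY for n in day_counts]

# Hypothetical stand-ins for NOW() and the number_of_days column.
now = datetime(2024, 1, 1, 12, 0, 0)
result = future_dates(now, [0, 1, 30])
for value in result:
    print(value.isoformat())
```

Scaling a unit interval avoids building the interval from a string, which is what the attempted query above tries to do.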

postgresql intervals

53
Votes
5
Answers
20k
Views

Calculating duration by subtracting two datetime columns in string format

I have a Spark DataFrame that contains a series of dates:

from pyspark.sql import SQLContext
from pyspark.sql import Row
from pyspark.sql.types import *
sqlContext = SQLContext(sc)
import pandas as pd

rdd = sc.parallelize([('X01','2014-02-13T12:36:14.899','2014-02-13T12:31:56.876','sip:4534454450'),
                      ('X02','2014-02-13T12:35:37.405','2014-02-13T12:32:13.321','sip:6413445440'),
                      ('X03','2014-02-13T12:36:03.825','2014-02-13T12:32:15.229','sip:4534437492'),
                      ('XO4','2014-02-13T12:37:05.460','2014-02-13T12:32:36.881','sip:6474454453'),
                      ('XO5','2014-02-13T12:36:52.721','2014-02-13T12:33:30.323','sip:8874458555')])
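schema = StructType([StructField('ID', StringType(), True),
                     StructField('EndDateTime', StringType(), True),
                     StructField('StartDateTime', StringType(), True),
                     # Assumed name: the original schema omits the fourth tuple element
                     StructField('Address', StringType(), True)])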
df = sqlContext.createDataFrame(rdd, schema)

What I want to do is find the duration by subtracting StartDateTime from EndDateTime. I figured I would try to do this with a function:

# Function to calculate time delta
def time_delta(y,x): 
    end = pd.to_datetime(y)
    start = pd.to_datetime(x)
    delta = (end-start)
    return delta

# create new RDD and add new column 'Duration' by applying …
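As a minimal sketch of the subtraction itself, the function below parses two timestamp strings in the format shown in the sample rows and returns the difference in seconds. It uses only the Python standard library rather than pandas or Spark, so it can be checked in isolation. `duration_seconds` and `TS_FORMAT` are hypothetical names introduced here, and the format string is an assumption based on the sample data.

```python
from datetime import datetime

# Assumed layout, inferred from the sample rows: ISO date, 'T', time with milliseconds.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"

def duration_seconds(end, start):
    """Return (end - start) in seconds for two timestamp strings."""
    end_dt = datetime.strptime(end, TS_FORMAT)
    start_dt = datetime.strptime(start, TS_FORMAT)
    return (end_dt - start_dt).total_seconds()

# First two rows of the sample data: (ID, EndDateTime, StartDateTime).
rows = [
    ("X01", "2014-02-13T12:36:14.899", "2014-02-13T12:31:56.876"),
    ("X02", "2014-02-13T12:35:37.405", "2014-02-13T12:32:13.321"),
]
durations = {row_id: duration_seconds(end, start) for row_id, end, start in rows}
print(durations)
```

Returning a plain float instead of a timedelta object keeps the result easy to store in a numeric column. In Spark, a function like this could in principle be wrapped with `pyspark.sql.functions.udf` and a `DoubleType` return type, although that wiring is not shown or tested here.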

apache-spark apache-spark-sql pyspark

28
Votes
3
Answers
50k
Views