I'm loading some log files into a SQL table through Spark, and my schema looks like this:
|-- timestamp: timestamp (nullable = true)
|-- c_ip: string (nullable = true)
|-- cs_username: string (nullable = true)
|-- s_ip: string (nullable = true)
|-- s_port: string (nullable = true)
|-- cs_method: string (nullable = true)
|-- cs_uri_stem: string (nullable = true)
|-- cs_query: string (nullable = true)
|-- sc_status: integer (nullable = false)
|-- sc_bytes: integer (nullable = false)
|-- cs_bytes: integer (nullable = false)
|-- time_taken: integer (nullable = false)
|-- User_Agent: string (nullable = true)
|-- Referrer: string (nullable = true)
As you can see, I created a timestamp field, which as far as I understand is supported by Spark (Date would not work). I would love to use it in queries like "where timestamp > (2012-10-08 16:10:36.0)", but whenever I run one I keep getting errors. I tried the two following syntax forms; for the second one I parse a string first, so I am sure I am actually passing the value in timestamp format. I use two functions: parse and date2timestamp.

Any hint on how I should handle timestamp values?

Thanks!
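The helpers mentioned above (formatTime3, date2timestamp) are not shown in the question; here is a minimal sketch of what such a parse-to-timestamp conversion might look like using only java.text.SimpleDateFormat and java.sql.Timestamp. The names mirror the question but the bodies are assumptions, not the asker's actual code:

```scala
import java.text.SimpleDateFormat
import java.sql.Timestamp

// Hypothetical equivalent of the asker's formatTime3: a pattern with fractional seconds.
val formatTime3 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.S")

// Hypothetical date2timestamp: wrap a java.util.Date's epoch millis in a java.sql.Timestamp.
def date2timestamp(d: java.util.Date): Timestamp = new Timestamp(d.getTime)

val ts = date2timestamp(formatTime3.parse("2012-10-08 16:10:36.0"))
println(ts)  // 2012-10-08 16:10:36.0
```

Note that concatenating such a Timestamp into the SQL string still produces an unquoted literal in the query text, which is why the second attempt below fails the same way as the first.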
1) scala> sqlContext.sql("SELECT * FROM Logs as l where l.timestamp = (2012-10-08 16:10:36.0)").collect
java.lang.RuntimeException: [1.55] failure: ``)'' expected but 16 found
SELECT * FROM Logs as l where l.timestamp=(2012-10-08 16:10:36.0)
^
2) sqlContext.sql("SELECT * FROM Logs as l where l.timestamp = " + date2timestamp(formatTime3.parse("2012-10-08 16:10:36.0"))).collect
java.lang.RuntimeException: [1.54] failure: ``UNION'' expected but 16 found
SELECT * FROM Logs as l where l.timestamp=2012-10-08 16:10:36.0
^
I figured that the problem was, first, the precision of the timestamp and, second, that the string I pass to represent the timestamp has to be compared as a String. So this query works now:
sqlContext.sql("SELECT * FROM Logs as l where cast(l.timestampLog as String) <= '2012-10-08 16:10:36'")
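The string cast works here because zero-padded `yyyy-MM-dd HH:mm:ss` timestamps sort lexicographically in the same order as the instants they denote, so a plain string comparison agrees with a timestamp comparison. A small stdlib-only illustration:

```scala
// Zero-padded ISO-style timestamps compare the same way as the instants they represent,
// which is why casting the column to String and comparing against a literal behaves correctly.
val earlier = "2012-10-08 09:05:00"
val later   = "2012-10-08 16:10:36"
println(earlier < later)  // true: string order matches chronological order
```

This also suggests why the fractional ".0" had to be dropped from the literal: if the cast column renders sub-second digits and the literal does not (or vice versa), the string comparison no longer lines up character for character.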
You forgot the quotes.

Try the following syntax:
L.timestamp = '2012-07-16 00:00:00'
Alternatively, try
L.timestamp = CAST('2012-07-16 00:00:00' AS TIMESTAMP)
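The CAST form makes both sides of the comparison typed as timestamps instead of relying on string order. Conceptually it is the same conversion that java.sql.Timestamp.valueOf performs on a `yyyy-mm-dd hh:mm:ss[.f...]` literal; this is a stdlib illustration of that conversion, not Spark's actual cast implementation:

```scala
import java.sql.Timestamp

// valueOf accepts the same yyyy-mm-dd hh:mm:ss[.f...] form used in the query literal.
val t = Timestamp.valueOf("2012-07-16 00:00:00")
println(t)  // 2012-07-16 00:00:00.0
```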