How do I convert a column from a hex string to a long?

use*_*657 2 scala apache-spark apache-spark-sql

I have a DataFrame with an Icao column containing hex codes, and I would like to convert that column to the Long data type. How can I do this in Spark SQL?

+------+-----+
|  Icao|count|
+------+-----+
|471F8D|81350|
|471F58|79634|
|471F56|79112|
|471F86|78177|
|471F8B|75300|
|47340D|75293|
|471F83|74864|
|471F57|73815|
|471F4A|72290|
|471F5F|72133|
|40612C|69676|
+------+-----+

Jac*_*ski 8

TL;DR Use the conv standard function.

conv(num: Column, fromBase: Int, toBase: Int): Column converts a number in a string column from one base to another.
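
To make the base conversion concrete, here is a minimal plain-Scala sketch (no Spark involved) of what going from base 16 to base 10 means for two of the values in the question. The helper name hexToLong is hypothetical and only illustrates the arithmetic; it is not how conv is implemented.

```scala
// Hypothetical helper, for illustration only: parse a hex string as a Long.
// java.lang.Long.parseLong throws NumberFormatException on invalid input.
def hexToLong(hex: String): Long = java.lang.Long.parseLong(hex, 16)

// Two sample values taken from the question's Icao column.
val samples = Seq("471F8D", "40612C")
samples.foreach(h => println(s"$h -> ${hexToLong(h)}"))
// 471F8D -> 4661133
// 40612C -> 4219180
```

These match the values that conv produces in the output further down. On a DataFrame, conv is still the better fit because it runs as a built-in Spark expression.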

A solution using conv could look as follows:

scala> icao.show
+------+-----+
|  Icao|count|
+------+-----+
|471F8D|81350|
|471F58|79634|
|471F56|79112|
|471F86|78177|
|471F8B|75300|
|47340D|75293|
|471F83|74864|
|471F57|73815|
|471F4A|72290|
|471F5F|72133|
|40612C|69676|
+------+-----+

// conv is not available by default unless you're in spark-shell
import org.apache.spark.sql.functions.conv

val s1 = icao.withColumn("conv", conv($"Icao", 16, 10))
scala> s1.show
+------+-----+-------+
|  Icao|count|   conv|
+------+-----+-------+
|471F8D|81350|4661133|
|471F58|79634|4661080|
|471F56|79112|4661078|
|471F86|78177|4661126|
|471F8B|75300|4661131|
|47340D|75293|4666381|
|471F83|74864|4661123|
|471F57|73815|4661079|
|471F4A|72290|4661066|
|471F5F|72133|4661087|
|40612C|69676|4219180|
+------+-----+-------+

Here conv returns its result as a string column: I started with strings and got strings back.

scala> s1.printSchema
root
 |-- Icao: string (nullable = true)
 |-- count: string (nullable = true)
 |-- conv: string (nullable = true)

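Note that conv's return type is string regardless of the input: an integer input is implicitly cast to a string first, so starting with integers would not give integers back.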

You can convert the result of conv to the type you need using another built-in method, cast.

val s2 = icao.withColumn("conv", conv($"Icao", 16, 10) cast "long")
scala> s2.printSchema
root
 |-- Icao: string (nullable = true)
 |-- count: string (nullable = true)
 |-- conv: long (nullable = true)

scala> s2.show
+------+-----+-------+
|  Icao|count|   conv|
+------+-----+-------+
|471F8D|81350|4661133|
|471F58|79634|4661080|
|471F56|79112|4661078|
|471F86|78177|4661126|
|471F8B|75300|4661131|
|47340D|75293|4666381|
|471F83|74864|4661123|
|471F57|73815|4661079|
|471F4A|72290|4661066|
|471F5F|72133|4661087|
|40612C|69676|4219180|
+------+-----+-------+
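
Since the question asks about Spark SQL specifically, the same conversion can also be written as a SQL query. The following is a minimal sketch, assuming a local SparkSession and a temporary view named icao (the object name, view name, and two-row sample data are assumptions made for illustration):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object IcaoSql {
  // Builds a small sample DataFrame and converts Icao with conv + CAST in SQL.
  def convert(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Two sample rows from the question; count is kept as a string as in the schema above.
    val icao = Seq(("471F8D", "81350"), ("40612C", "69676")).toDF("Icao", "count")
    icao.createOrReplaceTempView("icao") // view name is an assumption

    // conv returns a string, so CAST it to BIGINT (Spark's long type).
    spark.sql(
      "SELECT Icao, `count`, CAST(conv(Icao, 16, 10) AS BIGINT) AS conv FROM icao")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("icao-conv")
      .getOrCreate()
    val converted = convert(spark)
    converted.printSchema()
    converted.show()
    spark.stop()
  }
}
```

This mirrors the DataFrame version above: conv does the base conversion and the cast changes the column type.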

  • This should be the accepted answer. Built-in functions are better than UDFs. (4 upvotes)