DecimalType issue when creating a DataFrame

Bha*_*h K 2 scala dataframe apache-spark

When I try to create a DataFrame with a decimal type column, I get the error shown below.

These are the steps I am performing:

import org.apache.spark.sql.Row
// DataTypes itself must be imported, because the code below calls DataTypes.createDecimalType
import org.apache.spark.sql.types.{DataTypes, StringType, StructField, StructType}

// Create a decimal type with precision 15 and scale 10
val DecimalType = DataTypes.createDecimalType(15, 10)

// Create a schema

val sch = StructType(StructField("COL1", StringType, true) :: StructField("COL2", DecimalType, true) :: Nil)

val src = sc.textFile("test_file.txt")
val row = src.map(x=>x.split(",")).map(x=>Row.fromSeq(x))
val df1= sqlContext.createDataFrame(row,sch)

df1 is created without any error. However, when I run the df1.collect() action, it fails with the following error:

scala.MatchError: 0 (of class java.lang.String)
    at org.apache.spark.sql.catalyst.CatalystTypeConverters$DecimalConverter.toCatalystImpl(CatalystTypeConverters.scala:326)

Contents of test_file.txt:

test1,0
test2,0.67
test3,10.65
test4,-10.1234567890

Is there something wrong with the way I am creating the DecimalType?

cst*_*ur4 9

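A column declared as DecimalType needs a BigDecimal instance as its value. The strings produced by split are not converted automatically, so parse them yourself: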
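As a rough illustration of why the string value fails, here is a simplified stand-in for a converter that dispatches on the runtime type of the value. The function `toDecimal` is hypothetical and is not Spark's actual code; it only assumes that the real converter pattern-matches on decimal types and has no case for `String`, which is consistent with the `scala.MatchError` in the stack trace.

```scala
import scala.util.Try

// Hypothetical, simplified converter (NOT Spark's implementation):
// it only knows how to handle values that are already decimals.
def toDecimal(value: Any): BigDecimal = value match {
  case d: BigDecimal           => d
  case d: java.math.BigDecimal => BigDecimal(d)
  // no case for String, so a String falls through to a MatchError
}

// A raw string, as produced by split(","), has no matching case.
val fromString = Try(toDecimal("0"))

// A value that is already a BigDecimal passes through.
val fromDecimal = Try(toDecimal(BigDecimal("0")))

println(fromString)
println(fromDecimal)
```

The corrected version of the question's code follows the same idea: convert the second field to a BigDecimal before putting it into the Row.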

val DecimalType = DataTypes.createDecimalType(15, 10)
val sch = StructType(StructField("COL1", StringType, true) :: StructField("COL2", DecimalType, true) :: Nil)

val src = sc.textFile("test_file.txt")
val row = src.map(x => x.split(",")).map(x => Row(x(0), BigDecimal.decimal(x(1).toDouble)))

val df1 = spark.createDataFrame(row, sch)
df1.collect().foreach { println }
df1.printSchema()

The result looks like this:

[test1,0E-10]
[test2,0.6700000000]
[test3,10.6500000000]
[test4,-10.1234567890]
root
 |-- COL1: string (nullable = true)
 |-- COL2: decimal(15,10) (nullable = true)
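A possible variation, sketched here as an assumption rather than as part of the original answer: parse the decimal straight from its text with `BigDecimal(String)` instead of going through `Double`, and skip lines that do not parse. The helper name `parseLine` is hypothetical. Parsing the text directly keeps every digit exactly as written, which may matter for precisions wider than a `Double` can represent.

```scala
import scala.util.Try

// Hypothetical helper (not from the original post): split one "name,value"
// line and parse the value directly from its text, without a Double round trip.
// Returns None for lines that do not have exactly two fields or whose
// second field is not a valid decimal.
def parseLine(line: String): Option[(String, BigDecimal)] =
  line.split(",", -1) match {
    case Array(name, value) =>
      Try(BigDecimal(value.trim)).toOption.map(v => (name.trim, v))
    case _ => None
  }

// Two lines from test_file.txt plus two made-up malformed lines.
val sample = Seq("test1,0", "test4,-10.1234567890", "bad,abc", "no_comma")

// Only the well-formed lines survive.
val parsed = sample.flatMap(line => parseLine(line))
parsed.foreach(println)
```

In the Spark job this could replace the two map calls with something like `src.flatMap(line => parseLine(line)).map { case (name, value) => Row(name, value) }`. Whether silently dropping bad lines is acceptable depends on the data; logging or failing fast may be the better choice.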