PySpark error: dataType <class 'pyspark.sql.types.StringType'> should be an instance of <class 'pyspark.sql.types.DataType'>

Kar*_*wal 4 python apache-spark apache-spark-sql pyspark

I need to extract some data from a PipelinedRDD, but when I convert it to a DataFrame I get the following error:

Traceback (most recent call last):
  File "/home/karan/Desktop/meds.py", line 42, in <module>
    relevantToSymEntered(newrdd)
  File "/home/karan/Desktop/meds.py", line 26, in relevantToSymEntered
    mat = spark.createDataFrame(self,StructType([StructField("Prescribed medicine",StringType), StructField(["Disease","ID","Symptoms Recorded","Severeness"],ArrayType)]))
  File "/home/karan/Downloads/spark-2.4.2-bin-hadoop2.7/python/pyspark/sql/types.py", line 409, in __init__
    "dataType %s should be an instance of %s" % (dataType, DataType)
AssertionError: dataType <class 'pyspark.sql.types.StringType'> should be an instance of <class 'pyspark.sql.types.DataType'>

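1. My error is of a different type: that was a TypeError, whereas I am getting an AssertionError.

2. My problem has nothing to do with data type conversion.

I have already tried toDF(), but it changes the column names, which I do not want.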


小智 10

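Replace

StructType([StructField("Prescribed medicine",StringType), StructField(["Disease","ID","Symptoms Recorded","Severeness"],ArrayType)])

with:

StructType([StructField("Prescribed medicine",StringType()), StructField(["Disease","ID","Symptoms Recorded","Severeness"],ArrayType())])

You need to instantiate the classes: StructField expects a DataType instance such as StringType(), not the class StringType itself. Note that this alone does not make the second field valid. ArrayType also requires an element type argument, for example ArrayType(StringType()), and a StructField name must be a single string rather than a list.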
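
To see why the parentheses matter, here is a minimal sketch that reproduces the check quoted in the traceback. The classes below are simplified stand-ins written for this illustration, not PySpark's actual implementation, and try_build is a hypothetical helper. The sketch only mirrors the isinstance assertion and its message.

```python
class DataType:
    """Stand-in for pyspark.sql.types.DataType (illustration only)."""


class StringType(DataType):
    """Stand-in for pyspark.sql.types.StringType (illustration only)."""


class StructField:
    """Stand-in that mirrors only the type check quoted in the traceback."""

    def __init__(self, name, dataType):
        # A class object such as StringType is not an instance of DataType;
        # only an object created with StringType() is.
        assert isinstance(dataType, DataType), \
            "dataType %s should be an instance of %s" % (dataType, DataType)
        self.name = name
        self.dataType = dataType


def try_build(data_type):
    """Hypothetical helper: return 'ok' or the assertion message."""
    try:
        StructField("Prescribed medicine", data_type)
        return "ok"
    except AssertionError as exc:
        return str(exc)


class_result = try_build(StringType)       # passes the class itself: fails
instance_result = try_build(StringType())  # passes an instance: succeeds

print(class_result)
print(instance_result)
```

Passing StringType hands over the class object, so the isinstance check fails and raises the AssertionError. Passing StringType() hands over an instance, which satisfies the check.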