Posts by Kar*_*wal

PySpark error: dataType <class 'pyspark.sql.types.StringType'> should be an instance of <class 'pyspark.sql.types.DataType'>

I need to extract some data from a PipelinedRDD, but converting it to a DataFrame raises the following error:

Traceback (most recent call last):
  File "/home/karan/Desktop/meds.py", line 42, in <module>
    relevantToSymEntered(newrdd)
  File "/home/karan/Desktop/meds.py", line 26, in relevantToSymEntered
    mat = spark.createDataFrame(self,StructType([StructField("Prescribed medicine",StringType), StructField(["Disease","ID","Symptoms Recorded","Severeness"],ArrayType)]))
  File "/home/karan/Downloads/spark-2.4.2-bin-hadoop2.7/python/pyspark/sql/types.py", line 409, in __init__
    "dataType %s should be an instance of %s" % (dataType, DataType)
AssertionError: dataType <class 'pyspark.sql.types.StringType'> should be an instance of <class 'pyspark.sql.types.DataType'>


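1. My error is of a different type: the other one is a TypeError, whereas I am getting an AssertionError.

My problem has nothing to do with casting data types.

I have already tried toDF(), but it changes the column names, which I do not want.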


python apache-spark apache-spark-sql pyspark

Kar*_*wal

2019-05-07

Score: 4

Answers: 1

Views: 9807
