I need to extract some data from a PipelinedRDD, but converting it to a DataFrame fails with the following error:
Traceback (most recent call last):
  File "/home/karan/Desktop/meds.py", line 42, in <module>
    relevantToSymEntered(newrdd)
  File "/home/karan/Desktop/meds.py", line 26, in relevantToSymEntered
    mat = spark.createDataFrame(self,StructType([StructField("Prescribed medicine",StringType), StructField(["Disease","ID","Symptoms Recorded","Severeness"],ArrayType)]))
  File "/home/karan/Downloads/spark-2.4.2-bin-hadoop2.7/python/pyspark/sql/types.py", line 409, in __init__
    "dataType %s should be an instance of %s" % (dataType, DataType)
AssertionError: dataType <class 'pyspark.sql.types.StringType'> should be an instance of <class 'pyspark.sql.types.DataType'>
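My error is of a different type: the error there is a TypeError, whereas I am getting an AssertionError.
I have already tried toDF(), but it changes the column names, which I do not want.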