I am reading a CSV file in Python using pandas:
data = pd.read_csv('file1.csv', error_bad_lines=False)
I get this message: Skipping line 6: expected 4 fields, saw 6
How can I stop this warning from appearing? Thanks.
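One possible way to silence the message, sketched under the assumption that pandas 1.3 or later is installed. There, `on_bad_lines="skip"` drops malformed rows without printing anything; on older versions the rough equivalent was passing `warn_bad_lines=False` together with `error_bad_lines=False`. The inline CSV text below is a hypothetical stand-in for file1.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for file1.csv: the second data row has 6 fields
# where the header declares only 4.
csv_text = "a,b,c,d\n1,2,3,4\n5,6,7,8,9,10\n11,12,13,14\n"

# pandas >= 1.3: "skip" drops bad rows silently ("warn" would report them).
# Older pandas: pd.read_csv(..., error_bad_lines=False, warn_bad_lines=False)
data = pd.read_csv(io.StringIO(csv_text), on_bad_lines="skip")

print(data)
```

Skipping rows silently hides data problems, so it may be worth checking how many rows were dropped before relying on the result.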
I am trying to split a column of a DataFrame in pyspark. This is the data I have:
df = sc.parallelize([[1, 'Foo|10'], [2, 'Bar|11'], [3,'Car|12']]).toDF(['Key', 'Value'])
df = df.withColumn('Splitted', split(df['Value'], '|')[0])
I get
+---+------+--------+
|Key| Value|Splitted|
+---+------+--------+
|  1|Foo|10|       F|
|  2|Bar|11|       B|
|  3|Car|12|       C|
+---+------+--------+
But I want
+---+-----+--------+
|Key|Value|Splitted|
+---+-----+--------+
|  1|   10|     Foo|
|  2|   11|     Bar|
|  3|   12|     Car|
+---+-----+--------+
Can someone point out what I am doing wrong?
What if I have a unique situation like this?
df = sc.parallelize([[1, 'Foo|10|we'], [2, 'Bar|11|we'], [3,'Car|12|we']]).toDF(['Key', 'Value'])
+---+---------+
|Key| Value|
+---+---------+
| 1|Foo|10|we|
| …
I have a JSON file and I am trying to access a value in it, but I keep getting an error that says "TypeError: string indices must be integers, not str".
Here is the JSON data.
{'sentiment': '{\n "0": {\n "comment": "Chibok schoolgirls were swapped for 5 Boko Haram commanders", \n "username": "@NigeriaNewsdesk:, @todayng", \n "score": 0.0\n }\n}'}
I set data = val['sentiment'], and printing data returns this:
{
"0": {
"comment": "Chibok schoolgirls were swapped for 5 Boko Haram commanders",
"username": "@NigeriaNewsdesk:, @todayng",
"score": 0.0
}
}
But when I try to access the key/value pairs like this, I get an error:
for records in data:
    print(records["0"]["username"])
TypeError: string indices must be integers, not str
Any idea why I am getting these errors? Thanks.
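The error is consistent with `val['sentiment']` being a string that holds JSON text instead of a dict: looping over a string yields single characters, and indexing a character with "0" raises this TypeError. A minimal sketch of decoding the string first with `json.loads` follows. The value of `val` is copied from the question, with the closing quote of the comment field assumed to match the printed output.

```python
import json

# Value as shown in the question; the inner JSON is still an undecoded string.
val = {
    "sentiment": '{\n "0": {\n "comment": "Chibok schoolgirls were swapped '
                 'for 5 Boko Haram commanders", \n "username": '
                 '"@NigeriaNewsdesk:, @todayng", \n "score": 0.0\n }\n}'
}

# Decode the JSON text into a real dict before indexing into it.
data = json.loads(val["sentiment"])

# Each key ("0", ...) maps to one record dict.
for key, record in data.items():
    print(key, record["username"])
```

If the outer object also arrives as text, for example read straight from a file, it would need its own `json.load` or `json.loads` call first.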