I have a list of dictionaries like this:
[{'points': 50, 'time': '5:00', 'year': 2010},
{'points': 25, 'time': '6:00', 'month': "february"},
{'points':90, 'time': '9:00', 'month': 'january'},
{'points_h1':20, 'month': 'june'}]
I want to turn it into a pandas DataFrame like this:
      month  points  points_h1  time  year
0       NaN      50        NaN  5:00  2010
1  february      25        NaN  6:00   NaN
2   january      90        NaN  9:00   NaN
3      june     NaN         20   NaN   NaN
Note: the order of the columns does not matter.
How can I convert the list of dictionaries into a pandas DataFrame as shown above?
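A minimal sketch of one common way to get this result: pass the list straight to the pandas DataFrame constructor, which uses the union of the keys as columns and fills the gaps with NaN. The variable name records is only illustrative.

```python
import pandas as pd

# The list of dictionaries from the question (the name `records` is illustrative)
records = [
    {'points': 50, 'time': '5:00', 'year': 2010},
    {'points': 25, 'time': '6:00', 'month': "february"},
    {'points': 90, 'time': '9:00', 'month': 'january'},
    {'points_h1': 20, 'month': 'june'},
]

# Each dictionary becomes one row; keys missing from a dictionary become NaN
df = pd.DataFrame(records)
print(df)
```

Because some columns now contain NaN, pandas stores the numeric ones as floats, so 50 may display as 50.0.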
I have a DataFrame df that is loaded from a database. Most of the columns are JSON strings, and some are even lists of JSON objects. For example:
id name columnA columnB
1 John {"dist": "600", "time": "0:12.10"} [{"pos": "1st", "value": "500"},{"pos": "2nd", "value": "300"},{"pos": "3rd", "value": "200"}, {"pos": "total", "value": "1000"}]
2 Mike {"dist": "600"} [{"pos": "1st", "value": "500"},{"pos": "2nd", "value": "300"},{"pos": "total", "value": "800"}]
...
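As you can see, not all rows have the same number of elements in the JSON strings of a column.
What I need to do is keep the normal columns like id and name as they are, and flatten the JSON columns like this: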
id  name  columnA.dist  columnA.time  columnB.pos.1st  columnB.pos.2nd  columnB.pos.3rd  columnB.pos.total
1   John  600           0:12.10       500              300              200              1000
2   Mike  600           NaN           500              300              NaN              800
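For reference, a rough sketch of one way to reach that shape. It assumes every cell in columnA and columnB is a valid JSON string and that df has the default 0..n-1 index; the names col_a, col_b and flat are made up for illustration.

```python
import json
import pandas as pd

# Small stand-in for the frame loaded from the database (JSON kept as strings)
df = pd.DataFrame({
    "id": [1, 2],
    "name": ["John", "Mike"],
    "columnA": ['{"dist": "600", "time": "0:12.10"}', '{"dist": "600"}'],
    "columnB": [
        '[{"pos": "1st", "value": "500"}, {"pos": "2nd", "value": "300"}, '
        '{"pos": "3rd", "value": "200"}, {"pos": "total", "value": "1000"}]',
        '[{"pos": "1st", "value": "500"}, {"pos": "2nd", "value": "300"}, '
        '{"pos": "total", "value": "800"}]',
    ],
})

# columnA: each cell is one JSON object, so each key becomes a column
col_a = pd.json_normalize(df["columnA"].map(json.loads).tolist()).add_prefix("columnA.")

# columnB: each cell is a list of {"pos", "value"} pairs, so each "pos" becomes a column
col_b = pd.DataFrame(
    [{item["pos"]: item["value"] for item in json.loads(cell)} for cell in df["columnB"]]
).add_prefix("columnB.pos.")

# Put the plain columns and the flattened ones side by side
flat = pd.concat([df[["id", "name"]], col_a, col_b], axis=1)
print(flat)
```

The values stay as strings here (for example "600"), since that is how they appear in the JSON; converting them to numbers would be a separate step.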
I tried using json_normalize like this:
from pandas.io.json import json_normalize
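json_normalize(df) …
I have a CSV in which one of the fields is a nested JSON object stored as a string. I would like to load the CSV into a DataFrame and parse the JSON into a set of fields appended to the original DataFrame; in other words, extract the contents of the JSON and make them part of the DataFrame.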
My CSV:
id|dist|json_request
1|67|{"loc":{"lat":45.7, "lon":38.9},"arrival": "Monday", "characteristics":{"body":{"color":"red", "make":"sedan"}, "manuf_year":2014}}
2|34|{"loc":{"lat":46.89, "lon":36.7},"arrival": "Tuesday", "characteristics":{"body":{"color":"blue", "make":"sedan"}, "manuf_year":2014}}
3|98|{"loc":{"lat":45.70, "lon":31.0}, "characteristics":{"body":{"color":"yellow"}, "manuf_year":2010}}
Note that not all keys are present in every row. I would like it to produce a DataFrame equivalent to this:
data = {'id' : [1, 2, 3],
'dist' : [67, 34, 98],
'loc_lat': [45.7, 46.89, 45.70],
'loc_lon': [38.9, 36.7, 31.0],
'arrival': ["Monday", "Tuesday", "NA"],
'characteristics_body_color':["red", "blue", "yellow"],
'characteristics_body_make':["sedan", "sedan", "NA"],
'characteristics_manuf_year':[2014, 2014, 2010]}
df = pd.DataFrame(data)
(Sorry, I could not get the table itself to look sensible! Please don't be mad at me, I'm a newbie :( )
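For comparison, here is a compact sketch of one possible route to a frame like that. It assumes the JSON column parses cleanly with json.loads; the CSV is read from an in-memory string instead of sample.csv so the snippet runs on its own, and the names csv_text and parsed are illustrative. Missing keys come out as NaN, not the string "NA".

```python
import csv
import io
import json
import pandas as pd

# In-memory copy of the sample CSV (stands in for "sample.csv")
csv_text = '''id|dist|json_request
1|67|{"loc":{"lat":45.7, "lon":38.9},"arrival": "Monday", "characteristics":{"body":{"color":"red", "make":"sedan"}, "manuf_year":2014}}
2|34|{"loc":{"lat":46.89, "lon":36.7},"arrival": "Tuesday", "characteristics":{"body":{"color":"blue", "make":"sedan"}, "manuf_year":2014}}
3|98|{"loc":{"lat":45.70, "lon":31.0}, "characteristics":{"body":{"color":"yellow"}, "manuf_year":2010}}
'''

# QUOTE_NONE keeps the double quotes inside the JSON field untouched
df_raw = pd.read_csv(io.StringIO(csv_text), delimiter="|", quoting=csv.QUOTE_NONE)

# Parse every JSON string, then flatten nested keys into names like loc_lat
parsed = pd.json_normalize(df_raw["json_request"].map(json.loads).tolist(), sep="_")

# Append the flattened fields to the remaining original columns
df = pd.concat([df_raw.drop(columns="json_request"), parsed], axis=1)
print(df)
```

Normalizing the whole column in one call avoids building a one-row DataFrame per record, which tends to be slow on larger files.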
After a lot of struggling, I came up with the following solution:
import json
import pandas as pd
# Import data
df_raw = pd.read_csv("sample.csv", delimiter="|")
# Parsing function: turn one JSON string into a one-row DataFrame
def parse_request(s):
    sj = json.loads(s)
    norm = pd.json_normalize(sj)
    return norm
# Create an …
I have seen many questions on Stack Overflow about reading JSON with pandas, but I still cannot solve this simple problem.
{"session_id":{"0":["X061RFWB06K9V"],"1":["5AZ2X2A9BHH5U"]},"unix_timestamp":{"0":[1442503708],"1":[1441353991]},"cities":{"0":["New York NY, Newark NJ"],"1":["New York NY, Jersey City NJ, Philadelphia PA"]},"user":{"0":[[{"user_id":2024,"joining_date":"2015-03-22","country":"UK"}]],"1":[[{"user_id":2853,"joining_date":"2015-03-28","country":"DE"}]]}}
import numpy as np
import pandas as pd
import json
from pandas.io.json import json_normalize
# attempt 1
df = pd.read_json('a.json')
# attempt 2
with open('a.json') as fi:
    data = json.load(fi)
    df = json_normalize(data,record_path='user',meta=['session_id','unix_timestamp','cities'])
Neither of them gives me the required output.
session_id unix_timestamp cities user_id joining_date country
0 X061RFWB06K9V 1442503708 New York NY 2024 2015-03-22 UK
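0 X061RFWB06K9V …
I would like to create a pandas DataFrame from this web scrape so that I can export the data to Excel. Is anyone familiar with this? I have seen different approaches online and on this site, but I have not been able to reproduce the results with this scrape.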
Here is the code so far:
import requests
source = requests.get("https://api.lineups.com/nba/fetch/lineups/gateway").json()
for team in source['data']:
    print("\n%s players\n" % team['home_route'].capitalize())
    for player in team['home_players']:
        print(player['name'])
    print("\n%s players\n" % team['away_route'].capitalize())
    for player in team['away_players']:
        print(player['name'])
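This site looks useful, but its examples are different:
https://www.tutorialspoint.com/python_pandas/python_pandas_dataframe.htm
Here is another example from stackoverflow.com:
I am new to coding and scraping, so any help would be greatly appreciated. Thanks in advance for your time and effort!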
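A rough sketch of how the loop above could collect rows for a DataFrame in place of printing them. The payload below is made up and only mimics the keys the loop already uses (data, home_route, home_players, away_route, away_players, name); the real API response may look different, and the column names team, side and player are my own choice.

```python
import pandas as pd

# Made-up stand-in for requests.get(...).json(); not real API data
source = {
    "data": [
        {
            "home_route": "team-a",
            "away_route": "team-b",
            "home_players": [{"name": "Player One"}, {"name": "Player Two"}],
            "away_players": [{"name": "Player Three"}],
        }
    ]
}

# Build one dictionary per player, then hand the list to pandas
rows = []
for team in source["data"]:
    for side in ("home", "away"):
        for player in team[side + "_players"]:
            rows.append({
                "team": team[side + "_route"].capitalize(),
                "side": side,
                "player": player["name"],
            })

df = pd.DataFrame(rows)
print(df)

# Exporting would then be a single call; it needs an Excel writer
# such as openpyxl to be installed:
# df.to_excel("lineups.xlsx", index=False)
```

Collecting plain dictionaries first and building the DataFrame once at the end is usually simpler than appending to a DataFrame inside the loop.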