Related problems and solutions (0)

Convert a list of dictionaries to a pandas DataFrame

I have a list of dictionaries like this:

[{'points': 50, 'time': '5:00', 'year': 2010}, 
{'points': 25, 'time': '6:00', 'month': "february"}, 
{'points':90, 'time': '9:00', 'month': 'january'}, 
{'points_h1':20, 'month': 'june'}]

I want to turn it into a pandas DataFrame like this:

      month  points  points_h1  time  year
0       NaN      50        NaN  5:00  2010
1  february      25        NaN  6:00   NaN
2   january      90        NaN  9:00   NaN
3      june     NaN         20   NaN   NaN

Note: the order of the columns does not matter.

How can I convert the list of dictionaries into a pandas DataFrame as shown above?
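
For reference, a minimal sketch of one common way to do this: the `pd.DataFrame` constructor accepts a list of dictionaries directly and fills keys that are missing from a record with NaN. The variable name `records` is an assumption made for illustration; the data is the list from the question.

```python
import pandas as pd

# The list of dictionaries from the question ("records" is an illustrative name).
records = [
    {'points': 50, 'time': '5:00', 'year': 2010},
    {'points': 25, 'time': '6:00', 'month': "february"},
    {'points': 90, 'time': '9:00', 'month': 'january'},
    {'points_h1': 20, 'month': 'june'},
]

# Each dictionary becomes one row; the columns are the union of all keys,
# and keys absent from a record are filled with NaN.
df = pd.DataFrame(records)
print(df)
```

The column order of the result may differ from the table in the question, which the question states is acceptable.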

python dictionary dataframe pandas

Score: 550 | Answers: 4 | Views: 250k

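How to flatten a pandas dataframe with some columns as json?

I have a dataframe df that loads data from a database. Most of the columns are JSON strings, and some are even lists of JSON objects. For example: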

id     name     columnA                               columnB
1     John     {"dist": "600", "time": "0:12.10"}    [{"pos": "1st", "value": "500"},{"pos": "2nd", "value": "300"},{"pos": "3rd", "value": "200"}, {"pos": "total", "value": "1000"}]
2     Mike     {"dist": "600"}                       [{"pos": "1st", "value": "500"},{"pos": "2nd", "value": "300"},{"pos": "total", "value": "800"}]
...

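As you can see, not all rows have the same number of elements in the JSON strings of a column.

What I need to do is keep the normal columns, such as id and name, as they are and flatten the JSON columns like this: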

id    name   columnA.dist   columnA.time   columnB.pos.1st   columnB.pos.2nd   columnB.pos.3rd     columnB.pos.total
1     John   600            0:12.10        500               300               200                 1000 
2     Mike   600            NaN            500               300               NaN                 800

I tried using json_normalize like this:

from pandas.io.json import json_normalize
json_normalize(df) …
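
One possible approach, sketched under assumptions: parse each JSON string with `json.loads` first, flatten columnA with `pd.json_normalize`, and pivot the `pos`/`value` pairs of columnB into one column per `pos`. The frame below is rebuilt by hand from the two sample rows in the question, and the names `col_a`, `col_b` and `flat` are illustrative only. `pd.json_normalize` is the top-level name available in pandas 1.0 and later.

```python
import json
import pandas as pd

# Sample frame rebuilt from the two rows shown in the question.
df = pd.DataFrame({
    "id": [1, 2],
    "name": ["John", "Mike"],
    "columnA": ['{"dist": "600", "time": "0:12.10"}', '{"dist": "600"}'],
    "columnB": [
        '[{"pos": "1st", "value": "500"}, {"pos": "2nd", "value": "300"}, '
        '{"pos": "3rd", "value": "200"}, {"pos": "total", "value": "1000"}]',
        '[{"pos": "1st", "value": "500"}, {"pos": "2nd", "value": "300"}, '
        '{"pos": "total", "value": "800"}]',
    ],
})

# columnA: each cell holds one JSON object, so every key becomes a column.
col_a = pd.json_normalize(df["columnA"].map(json.loads).tolist()).add_prefix("columnA.")

# columnB: each cell holds a list of {"pos", "value"} objects,
# so every distinct "pos" becomes a column holding the matching "value".
col_b = pd.DataFrame(
    [
        {f"columnB.pos.{item['pos']}": item["value"] for item in json.loads(cell)}
        for cell in df["columnB"]
    ]
)

# Keep the plain columns and attach the flattened ones side by side.
flat = pd.concat([df[["id", "name"]], col_a, col_b], axis=1)
print(flat)
```

Keys that are missing in a row (for example `time` or `3rd` for Mike) come out as NaN. The values stay strings because the source JSON stores them as strings.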

python json flatten dataframe pandas

Score: 23 | Answers: 4 | Views: 20k

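Extract nested JSON embedded as a string in a Pandas dataframe

I have a CSV in which one field is a nested JSON object stored as a string. I would like to load the CSV into a dataframe and parse the JSON into a set of fields appended to the original dataframe; in other words, extract the contents of the JSON and make them part of the dataframe.

My CSV: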

id|dist|json_request
1|67|{"loc":{"lat":45.7, "lon":38.9},"arrival": "Monday", "characteristics":{"body":{"color":"red", "make":"sedan"}, "manuf_year":2014}}
2|34|{"loc":{"lat":46.89, "lon":36.7},"arrival": "Tuesday", "characteristics":{"body":{"color":"blue", "make":"sedan"}, "manuf_year":2014}}
3|98|{"loc":{"lat":45.70, "lon":31.0}, "characteristics":{"body":{"color":"yellow"}, "manuf_year":2010}}

Note that not all keys are the same for all rows. I would like it to produce a dataframe equivalent to this:

data = {'id'     : [1, 2, 3],
        'dist'  : [67, 34, 98],
        'loc_lat': [45.7, 46.89, 45.70],
        'loc_lon': [38.9, 36.7, 31.0],
        'arrival': ["Monday", "Tuesday", "NA"],
        'characteristics_body_color':["red", "blue", "yellow"],
        'characteristics_body_make':["sedan", "sedan", "NA"],
        'characteristics_manuf_year':[2014, 2014, 2010]}
df = pd.DataFrame(data)

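(I'm sorry, I couldn't get the table itself to look sensible! Please don't be mad at me, I'm a rookie :()

What I tried

After a lot of struggling, I came up with the following solution: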

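import json
import pandas as pd
from pandas import json_normalize  # top-level name in pandas >= 1.0

# Import data
df_raw = pd.read_csv("sample.csv", delimiter="|")

# Parsing function: flatten one JSON string into a one-row DataFrame
def parse_request(s):
    sj = json.loads(s)
    norm = json_normalize(sj)
    return norm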

#Create an …
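
A hedged sketch of how the whole column could be flattened in one pass rather than row by row. It is not the asker's truncated solution. It assumes pandas 1.0 or later for `pd.json_normalize`, reads the sample rows from an in-memory string instead of `sample.csv`, and passes `quoting=csv.QUOTE_NONE` so the double quotes inside the JSON are treated as ordinary characters. The names `csv_text`, `flat` and `result` are illustrative.

```python
import csv
import io
import json
import pandas as pd

# The three sample rows from the question, held in memory instead of sample.csv.
csv_text = """id|dist|json_request
1|67|{"loc":{"lat":45.7, "lon":38.9},"arrival": "Monday", "characteristics":{"body":{"color":"red", "make":"sedan"}, "manuf_year":2014}}
2|34|{"loc":{"lat":46.89, "lon":36.7},"arrival": "Tuesday", "characteristics":{"body":{"color":"blue", "make":"sedan"}, "manuf_year":2014}}
3|98|{"loc":{"lat":45.70, "lon":31.0}, "characteristics":{"body":{"color":"yellow"}, "manuf_year":2010}}
"""

# QUOTE_NONE keeps the JSON double quotes as literal characters.
df_raw = pd.read_csv(io.StringIO(csv_text), sep="|", quoting=csv.QUOTE_NONE)

# Parse every JSON string, then flatten all nesting levels at once;
# sep="_" produces names such as "loc_lat" and "characteristics_body_color".
parsed = df_raw["json_request"].map(json.loads).tolist()
flat = pd.json_normalize(parsed, sep="_")

# Replace the raw JSON column with the flattened columns (aligned on the row index).
result = df_raw.drop(columns="json_request").join(flat)
print(result)
```

Keys that are absent from a row, such as `arrival` in the third row, become NaN rather than the string "NA" used in the question's target frame.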

python csv json

Score: 9 | Answers: 1 | Views: 6133

How to read and normalize the following json in pandas?

I have seen many questions on stackoverflow about reading json with pandas, but I still could not solve this simple problem.

Data

{"session_id":{"0":["X061RFWB06K9V"],"1":["5AZ2X2A9BHH5U"]},"unix_timestamp":{"0":[1442503708],"1":[1441353991]},"cities":{"0":["New York NY, Newark NJ"],"1":["New York NY, Jersey City NJ, Philadelphia PA"]},"user":{"0":[[{"user_id":2024,"joining_date":"2015-03-22","country":"UK"}]],"1":[[{"user_id":2853,"joining_date":"2015-03-28","country":"DE"}]]}}

My attempts

import numpy as np
import pandas as pd
import json
from pandas.io.json import json_normalize

# attempt1
df = pd.read_json('a.json')

# attempt2
with open('a.json') as fi:
    data = json.load(fi)
    df = json_normalize(data,record_path='user',meta=['session_id','unix_timestamp','cities'])

Neither attempt gives me the required output.


Desired output

      session_id unix_timestamp       cities  user_id joining_date country 
0  X061RFWB06K9V     1442503708  New York NY     2024   2015-03-22      UK   
0  X061RFWB06K9V …
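
A rough sketch of one way to reshape this data, under stated assumptions. Every cell in the file is wrapped in a list (and `user` in a list of lists), so the sketch unwraps those first and then expands the user dictionary into columns. The desired output above is truncated, so splitting `cities` into one row per city is an assumption based only on the first visible row. The JSON is embedded as a string here instead of being read from `a.json`, and `raw`, `users` and `out` are illustrative names.

```python
import json
import pandas as pd

# The JSON document from the question, embedded instead of read from a.json.
raw = '''{"session_id":{"0":["X061RFWB06K9V"],"1":["5AZ2X2A9BHH5U"]},"unix_timestamp":{"0":[1442503708],"1":[1441353991]},"cities":{"0":["New York NY, Newark NJ"],"1":["New York NY, Jersey City NJ, Philadelphia PA"]},"user":{"0":[[{"user_id":2024,"joining_date":"2015-03-22","country":"UK"}]],"1":[[{"user_id":2853,"joining_date":"2015-03-28","country":"DE"}]]}}'''

# Outer keys become columns, inner keys ("0", "1") become the index.
df = pd.DataFrame(json.loads(raw))

# Unwrap the single-element lists around the scalar fields.
for col in ["session_id", "unix_timestamp", "cities"]:
    df[col] = df[col].map(lambda cell: cell[0])

# "user" is a list inside a list holding one dict; expand that dict into columns.
users = pd.json_normalize(df["user"].map(lambda cell: cell[0][0]).tolist())
users.index = df.index
out = pd.concat([df.drop(columns="user"), users], axis=1)

# Assumption: one output row per city, as the first row of the desired output suggests.
out["cities"] = out["cities"].str.split(", ")
out = out.explode("cities")
print(out)
```

After `explode`, rows that came from the same session share the same index label, which matches the repeated `0` visible in the desired output.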

python json pandas

Score: 4 | Answers: 1 | Views: 3203

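How do I create a pandas dataframe from a web scrape?

I would like to use this web scrape to create a pandas dataframe so that I can export the data to excel. Is anyone familiar with this? I have seen different methods online and on this site, but have been unable to replicate the results with this scrape.

Here is the code so far: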

import requests

source = requests.get("https://api.lineups.com/nba/fetch/lineups/gateway").json()

for team in source['data']:
    print("\n%s players\n" % team['home_route'].capitalize())
    for player in team['home_players']:
        print(player['name'])
    print("\n%s players\n" % team['away_route'].capitalize())
    for player in team['away_players']:
        print(player['name'])

This website seems helpful, but the examples are different:

https://www.tutorialspoint.com/python_pandas/python_pandas_dataframe.htm

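Here is another example from stackoverflow.com:

Load web scraping results into a Pandas DataFrame

I am new to coding/scraping, so any help will be greatly appreciated. Thanks in advance for your time and effort!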
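
A small sketch of the usual pattern for this kind of task: collect one dictionary per player while looping, then hand the list to `pd.DataFrame`. To stay runnable without network access, the `source` dictionary below is a made-up placeholder that only mirrors the keys used in the code above (`data`, `home_route`, `away_route`, `home_players`, `away_players`, `name`); the real API response is not reproduced here, and the column names `team`, `side` and `player` are assumptions.

```python
import pandas as pd

# Placeholder payload with the same keys the question's loop reads.
# The team and player values are invented for illustration only.
source = {
    "data": [
        {
            "home_route": "team-a",
            "away_route": "team-b",
            "home_players": [{"name": "Player A1"}, {"name": "Player A2"}],
            "away_players": [{"name": "Player B1"}],
        }
    ]
}

# Build one row per player instead of printing.
rows = []
for team in source["data"]:
    for side in ("home", "away"):
        for player in team[f"{side}_players"]:
            rows.append(
                {
                    "team": team[f"{side}_route"].capitalize(),
                    "side": side,
                    "player": player["name"],
                }
            )

df = pd.DataFrame(rows)
print(df)

# Exporting is then a single call; it needs an Excel writer such as openpyxl installed.
# df.to_excel("lineups.xlsx", index=False)
```

With the real response, `source` would come from the `requests.get(...).json()` call in the question, and the loop would stay the same as long as those keys are present.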

python json dataframe web-scraping pandas

Score: 2 | Answers: 1 | Views: 901

Tag statistics

python ×5

json ×4

pandas ×4

dataframe ×3

csv ×1

dictionary ×1

flatten ×1

web-scraping ×1