Nested JSON array to a Python Pandas DataFrame

Sti*_*ngo 5 python json dictionary pandas json-normalize

I am trying to expand a nested JSON array into a pandas DataFrame.

This is my JSON:

    [ {
            "id": "0001",
            "name": "Stiven",
            "location": [{
                    "country": "Colombia",
                    "department": "Chocó",
                    "city": "Quibdó"
                }, {
                    "country": "Colombia",
                    "department": "Antioquia",
                    "city": "Medellin"
                }, {
                    "country": "Colombia",
                    "department": "Cundinamarca",
                    "city": "Bogotá"
                }
            ]
        }, {
            "id": "0002",
            "name": "Jhon Jaime",
            "location": [{
                    "country": "Colombia",
                    "department": "Valle del Cauca",
                    "city": "Cali"
                }, {
                    "country": "Colombia",
                    "department": "Putumayo",
                    "city": "Mocoa"
                }, {
                    "country": "Colombia",
                    "department": "Arauca",
                    "city": "Arauca"
                }
            ]
        }, {
            "id": "0003",
            "name": "Francisco",
            "location": [{
                    "country": "Colombia",
                    "department": "Atlántico",
                    "city": "Barranquilla"
                }, {
                    "country": "Colombia",
                    "department": "Bolívar",
                    "city": "Cartagena"
                }, {
                    "country": "Colombia",
                    "department": "La Guajira",
                    "city": "Riohacha"
                }
            ]
        }
    ]

This is my DataFrame:

    index   id    name         location
    0       0001  Stiven       [{'country':'Colombia', 'department': 'Chocó', 'city': 'Quibdó'}, {'country':'Colombia', 'department': 'Antioquia', 'city': 'Medellin'}, {'country':'Colombia', 'department': 'Cundinamarca', 'city': 'Bogotá'}]
    1       0002  Jhon Jaime   [{'country':'Colombia', 'department': 'Valle del Cauca', 'city': 'Cali'}, {'country':'Colombia', 'department': 'Putumayo', 'city': 'Mocoa'}, {'country':'Colombia', 'department': 'Arauca', 'city': 'Arauca'}]
    2       0003  Francisco    [{'country':'Colombia', 'department': 'Atlántico', 'city': 'Barranquilla'}, {'country':'Colombia', 'department': 'Bolívar', 'city': 'Cartagena'}, {'country':'Colombia', 'department': 'La Guajira', 'city': 'Riohacha'}]

I need to expand each id into one row per location, so that the DataFrame looks like this:

    index   id    name         country   department       city
    0       0001  Stiven       Colombia  Chocó            Quibdó
    1       0001  Stiven       Colombia  Antioquia        Medellin
    2       0001  Stiven       Colombia  Cundinamarca     Bogotá
    3       0002  Jhon Jaime   Colombia  Valle del Cauca  Cali
    4       0002  Jhon Jaime   Colombia  Putumayo         Mocoa
    5       0002  Jhon Jaime   Colombia  Arauca           Arauca
    6       0003  Francisco    Colombia  Atlántico        Barranquilla
    7       0003  Francisco    Colombia  Bolívar          Cartagena
    8       0003  Francisco    Colombia  La Guajira       Riohacha

Tre*_*ney 7

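  • If the JSON is loaded from a file, parse it with json.loads; if the JSON comes directly from an API, this step may not be necessary.
  • Use pandas.json_normalize with the record_path and meta parameters to convert the JSON into a DataFrame.
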
    import pandas as pd
    from pathlib import Path
    import json

    # path to file
    p = Path(r'c:\path_to_file\test.json')

    # read json
    with p.open('r', encoding='utf-8') as f:
        data = json.loads(f.read())

    # create dataframe
    df = pd.json_normalize(data, record_path='location', meta=['id', 'name'])

    # output
    #   country       department          city    id        name
    #  Colombia            Chocó        Quibdó  0001      Stiven
    #  Colombia        Antioquia      Medellin  0001      Stiven
    #  Colombia     Cundinamarca        Bogotá  0001      Stiven
    #  Colombia  Valle del Cauca          Cali  0002  Jhon Jaime
    #  Colombia         Putumayo         Mocoa  0002  Jhon Jaime
    #  Colombia           Arauca        Arauca  0002  Jhon Jaime
    #  Colombia        Atlántico  Barranquilla  0003   Francisco
    #  Colombia          Bolívar     Cartagena  0003   Francisco
    #  Colombia       La Guajira      Riohacha  0003   Francisco
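
To try the same call without a file on disk, here is a minimal sketch that parses an inline string instead. The `raw` string is a shortened sample of the question's JSON, and the final column selection is my own addition (not part of the answer above) to put `id` and `name` first, as in the layout the question asks for.

```python
import json

import pandas as pd

# Shortened sample of the question's JSON (assumption: same structure, fewer rows).
raw = """
[
  {"id": "0001", "name": "Stiven",
   "location": [
     {"country": "Colombia", "department": "Chocó", "city": "Quibdó"},
     {"country": "Colombia", "department": "Antioquia", "city": "Medellin"}
   ]},
  {"id": "0002", "name": "Jhon Jaime",
   "location": [
     {"country": "Colombia", "department": "Valle del Cauca", "city": "Cali"}
   ]}
]
"""

# Parse the JSON text into a list of dicts.
data = json.loads(raw)

# One row per entry in "location"; "id" and "name" are repeated on each row.
df = pd.json_normalize(data, record_path="location", meta=["id", "name"])

# Reorder the columns so that id and name come first.
df = df[["id", "name", "country", "department", "city"]]

print(df.to_string(index=False))
```

The `meta` columns are appended after the record columns by default, which is why the explicit reordering is needed if the column order matters.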
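
If the data is already in a DataFrame with a `location` column holding lists of dicts, as shown in the question, a different route may also work: `DataFrame.explode` followed by `pandas.json_normalize` on the exploded column. This is an alternative technique, not the method from the answer above. The sketch assumes every row has a non-empty list of dicts, and the sample frame is a shortened, hand-built version of the question's data.

```python
import pandas as pd

# Hand-built sample mirroring the question's DataFrame (assumption: shortened data).
df = pd.DataFrame({
    "id": ["0001", "0002"],
    "name": ["Stiven", "Jhon Jaime"],
    "location": [
        [
            {"country": "Colombia", "department": "Chocó", "city": "Quibdó"},
            {"country": "Colombia", "department": "Antioquia", "city": "Medellin"},
        ],
        [
            {"country": "Colombia", "department": "Valle del Cauca", "city": "Cali"},
        ],
    ],
})

# One row per list element; reset the index so it lines up with the frame built below.
exploded = df.explode("location").reset_index(drop=True)

# Turn each location dict into country / department / city columns.
location = pd.json_normalize(exploded["location"].tolist())

# Put the id / name columns and the expanded location columns side by side.
flat = pd.concat([exploded.drop(columns="location"), location], axis=1)

print(flat.to_string(index=False))
```

Rows whose `location` is empty or missing would need separate handling before the `json_normalize` step, because `explode` turns them into NaN rather than a dict.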