Mal*_*ath 6 python python-3.x pandas json-normalize
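I need to write a Python script that does the following:
- Read a CSV file with the columns (person_id, name, flag). The file has 3000 rows.
- Take each person_id from the CSV file and call a URL with a GET request, passing the person_id:
http://api.myendpoint.intranet/get-data/1234
- The URL returns some information about the person_id, as in the response example below. I need to get all the rents objects and save them in my CSV.
My attempt so far:
import pandas as pd
import requests
path = '.'  # assumed: directory that contains data.csv
ids = pd.read_csv(f"{path}/data.csv", delimiter=';')
rows = []
for person_id in ids['person_id']:
    response = requests.get(f'http://api.myendpoint.intranet/get-data/{person_id}')
    data = response.json()
    # one output row per rent object in the response
    for rent in data['rents']:
        rows.append([person_id, rent['carId'], rent['price'], rent['rentStatus']])
person_rents = pd.DataFrame(rows, columns=['person_id', 'carId', 'price', 'rentStatus'])
My output needs to look like this: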
person_id;name;flag;cardId;price;rentStatus
1000;Joseph;1;6638;1000;active
1000;Joseph;1;5566;2000;active
Response example:
{
    "active": false,
    "ctodx": false,
    "rents": [{
            "carId": 6638,
            "price": 1000,
            "rentStatus": "active"
        }, {
            "carId": 5566,
            "price": 2000,
            "rentStatus": "active"
        }
    ],
    "responseCode": "OK",
    "status": [{
            "request": 345,
            "requestStatus": "F"
        }, {
            "requestId": 678,
            "requestStatus": "P"
        }
    ],
    "transaction": false
}
Each mileage call (one per carId) returns something like this:
{"mileage":1000.0000}
{"mileage":550.0000}
The final output must be:
person_id;name;flag;cardId;price;rentStatus;mileage
1000;Joseph;1;6638;1000;active;1000.0000
1000;Joseph;1;5566;2000;active;550.0000
Can someone help me write this script? It can use pandas or any Python 3 library.
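- Create the dataframe df with pd.read_csv.
- All values in 'person_id' are expected to be unique.
- Use .apply on 'person_id' to call prepare_data.
- prepare_data expects 'person_id' to be a str or int, as indicated by the type annotation Union[int, str].
- prepare_data calls the API, which returns a dict to the prepare_data function.
- Convert the 'rents' key of the dict into a dataframe with pd.json_normalize.
- Use .apply on 'carId' to call the API and extract 'mileage', which is added to the dataframe data as a column.
- Add 'person_id' to data, which can be used to merge df with s.
- Use pd.concat to convert the pd.Series s into a dataframe, then merge df and s on person_id.
- Save to a CSV in the desired form with DataFrame.to_csv.
- As long as call_api returns a dict like the response shown in the question, the rest of the code will work correctly and produce the desired output.
import pandas as pd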
import requests
import json
from typing import Union

def call_api(url: str) -> dict:
    r = requests.get(url)
    return r.json()

def prepare_data(uid: Union[int, str]) -> pd.DataFrame:
    d_url = f'http://api.myendpoint.intranet/get-data/{uid}'
    m_url = 'http://api.myendpoint.intranet/get-mileage/'
    # get the rent data from the api call
    rents = call_api(d_url)['rents']
    # normalize rents into a dataframe
    data = pd.json_normalize(rents)
    # get the mileage data from the api call and add it to data as a column
    data['mileage'] = data.carId.apply(lambda cid: call_api(f'{m_url}{cid}')['mileage'])
    # add person_id as a column to data, which will be used to merge data to df
    data['person_id'] = uid
    return data

# read data from file
df = pd.read_csv('file.csv', sep=';')
# call prepare_data
s = df.person_id.apply(prepare_data)
# s is a Series of DataFrames, which can be combined with pd.concat
s = pd.concat([v for v in s])
# join df with s, on person_id
df = df.merge(s, on='person_id')
# save to csv
df.to_csv('output.csv', sep=';', index=False)
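Since everything downstream depends on call_api returning a dict, and the input file has 3000 rows (so at least 3000 requests), a slightly more defensive call_api may be worth considering. The following is only a sketch, not part of the original answer: the session and timeout parameters and the type check are assumptions added for illustration, and the 10 second default is arbitrary.

```python
import requests


def call_api(url: str, session=None, timeout: float = 10.0) -> dict:
    """Fetch `url` and return the decoded JSON body as a dict.

    `session` and `timeout` are illustrative parameters (assumptions, not in
    the original answer). Any object with a requests-style `get` method works,
    for example a requests.Session reused across all calls.
    """
    http = session if session is not None else requests
    r = http.get(url, timeout=timeout)
    # raise requests.HTTPError for 4xx/5xx responses instead of parsing them
    r.raise_for_status()
    payload = r.json()
    # the rest of the script indexes the payload by key, so insist on a dict
    if not isinstance(payload, dict):
        raise ValueError(f'expected a JSON object from {url}, got {type(payload).__name__}')
    return payload
```

Passing one requests.Session() to every call lets the connection be reused, which may help when the loop makes thousands of requests to the same host.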
Paste any TraceBack as text into a code block.
# given the following start dataframe
   person_id    name  flag
0       1000  Joseph     1
1        400     Sam     1
# resulting dataframe using the same data for both id 1000 and 400
   person_id    name  flag  carId  price rentStatus  mileage
0       1000  Joseph     1   6638   1000     active   1000.0
1       1000  Joseph     1   5566   2000     active   1000.0
2        400     Sam     1   6638   1000     active   1000.0
3        400     Sam     1   5566   2000     active   1000.0
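The result above uses the same data for both ids, so it can be reproduced without network access by swapping call_api for a stub. The sketch below is one way to do that. fake_call_api, SAMPLE_RENTS and SAMPLE_MILEAGE are hypothetical names introduced here, with payloads copied from the question. The validate='one_to_many' argument is also an addition: it makes the merge fail loudly if person_id is not unique in df, which the approach assumes.

```python
import pandas as pd

# Canned payloads copied from the question; a real run would fetch these over HTTP.
SAMPLE_RENTS = {
    'rents': [
        {'carId': 6638, 'price': 1000, 'rentStatus': 'active'},
        {'carId': 5566, 'price': 2000, 'rentStatus': 'active'},
    ]
}
SAMPLE_MILEAGE = {'mileage': 1000.0}


def fake_call_api(url: str) -> dict:
    """Hypothetical offline stand-in for call_api: picks a payload by URL."""
    return SAMPLE_MILEAGE if '/get-mileage/' in url else SAMPLE_RENTS


def prepare_data(uid, call_api=fake_call_api) -> pd.DataFrame:
    d_url = f'http://api.myendpoint.intranet/get-data/{uid}'
    m_url = 'http://api.myendpoint.intranet/get-mileage/'
    # one row per rent object
    data = pd.json_normalize(call_api(d_url)['rents'])
    # one mileage lookup per carId
    data['mileage'] = data['carId'].apply(lambda cid: call_api(f'{m_url}{cid}')['mileage'])
    data['person_id'] = uid
    return data


df = pd.DataFrame({'person_id': [1000, 400], 'name': ['Joseph', 'Sam'], 'flag': [1, 1]})
# build one frame per person, then stack them
rents = pd.concat([prepare_data(uid) for uid in df['person_id']], ignore_index=True)
# raises pandas.errors.MergeError if person_id repeats in df
result = df.merge(rents, on='person_id', validate='one_to_many')
print(result)
```

If the input file may contain repeated person_id values, one option is to call df.drop_duplicates('person_id') before the loop so each id is requested only once.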