Web scraping futbin.com

Mar*_*cus 1 python json beautifulsoup web-scraping

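I am trying to collect a dataset of time-series data for FIFA Ultimate Team players from futbin.com. I found a script on GitHub, https://github.com/darkyin87/futbin-scraper, that can scrape a player's current price given the player's ID: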

import requests  
import json  

domain = 'https://www.futbin.com'  
version = 19  
page = 'playerPrices'  

player_ids = {  
  'Arturo Vidal': 181872,  
  'Pierre-Emerick Aubameyang': 188567,  
  'Robert Lewandowski': 188545,  
  'Jerome Boateng': 183907,  
  'Sergio Ramos': 155862,  
  'Antoine Griezmann': 194765,  
  'David Alaba': 197445,  
  'Paulo Dybala': 211110,  
  'Radja Nainggolan': 178518  
}

def fetch_prices():
  """Return the current PS price for every player in player_ids."""
  ret_val = {}
  # dict.items() replaces the Python 2-only iteritems()
  for name, player_id in player_ids.items():
    url = "%s/%s/%s?player=%s" % (domain, version, page, player_id)
    response = requests.get(url)
    data = response.json()
    ret_val[name] = data[str(player_id)]['prices']['ps']['LCPrice']
  return ret_val

if __name__ == "__main__":
  prices = fetch_prices()
  print(prices)

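However, what I am looking for is not the current price but the price history (specifically the PS price), which is shown in the graph at the bottom of the player page: https://www.futbin.com/19/player/143/Cristiano%20Ronaldo/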

I have tried a few things, but I cannot seem to parse or extract this information. Can anyone help me or give me a hint? Thanks in advance.

Sel*_*çuk 6

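Getting the data that way is difficult. If you inspect the page with your browser's network tools, you can see that the data behind the graph comes from a separate HTTP request, which you can call directly. Of course, do not abuse it.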

import requests
from datetime import datetime

player_ids = {  
  'Arturo Vidal': 181872,  
  'Pierre-Emerick Aubameyang': 188567,  
  'Robert Lewandowski': 188545,  
  'Jerome Boateng': 183907,  
  'Sergio Ramos': 155862,  
  'Antoine Griezmann': 194765,  
  'David Alaba': 197445,  
  'Paulo Dybala': 211110,  
  'Radja Nainggolan': 178518  
}

for name, player_id in player_ids.items():
    r = requests.get('https://www.futbin.com/19/playerGraph?type=daily_graph&year=19&player={0}'.format(player_id))
    data = r.json()

    print(name)
    print("-" * 20)
    # Change 'ps' to 'xbox' or 'pc' to get the prices for other platforms
    for point in data['ps']:
        # The timestamps carry three extra zeroes (milliseconds), so divide by 1000
        date = datetime.utcfromtimestamp(point[0] / 1000).strftime('%Y-%m-%d')
        print(date, point[1])

This gives you:

Arturo Vidal
--------------------
2018-09-21 8450
2018-09-22 9318
2018-09-23 10820
2018-09-24 13288
2018-09-25 13346
2018-09-26 17235
2018-09-27 19092
2018-09-28 15960
2018-09-29 14283
2018-09-30 14967
2018-10-01 15380
2018-10-02 15367
2018-10-03 13192
Pierre-Emerick Aubameyang
--------------------
2018-09-21 136000
2018-09-22 160673
2018-09-23 205474
2018-09-24 216344
2018-09-25 244750
2018-09-26 277007
2018-09-27 288659
2018-09-28 259007
2018-09-29 261799
2018-09-30 270771
2018-10-01 274245
2018-10-02 281057
2018-10-03 275606
Robert Lewandowski
--------------------
2018-09-21 73000
2018-09-22 79961
2018-09-23 94827
2018-09-24 117893
2018-09-25 125310
2018-09-26 144630
2018-09-27 159224
2018-09-28 135122
2018-09-29 132696
2018-09-30 137728
2018-10-01 143130
2018-10-02 150968
2018-10-03 144250

And so on for the remaining players.
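Since the goal is a dataset rather than printed lines, the same payload could be flattened into rows and written to CSV. The following is only a sketch: it assumes the response has the shape the loop above relies on (a list of `[timestamp_ms, price]` pairs under a platform key such as `ps`), and the helper names `history_to_rows` and `write_csv` are made up for illustration. The sample payload is hand-built to mirror the first two Arturo Vidal lines of the output above, so the block runs without any network access.

```python
import csv
import io
from datetime import datetime, timezone


def history_to_rows(name, payload, platform="ps"):
    """Turn one graph payload into (player, platform, date, price) rows.

    Assumes payload[platform] is a list of [timestamp_ms, price] pairs,
    which is what the loop in the answer relies on.
    """
    rows = []
    for point in payload.get(platform, []):
        # Timestamps are treated as milliseconds since the epoch (UTC).
        day = datetime.fromtimestamp(point[0] / 1000, tz=timezone.utc)
        rows.append((name, platform, day.strftime("%Y-%m-%d"), point[1]))
    return rows


def write_csv(rows, handle):
    """Write a header line followed by the rows to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["player", "platform", "date", "price"])
    writer.writerows(rows)


# Hand-built sample mirroring two lines of the output above (not fetched).
sample = {"ps": [[1537488000000, 8450], [1537574400000, 9318]]}

buffer = io.StringIO()
write_csv(history_to_rows("Arturo Vidal", sample), buffer)
print(buffer.getvalue())
```

In the loop above, `history_to_rows(name, data)` could be called once per player in place of the inner `print` loop, with the collected rows written to a file opened with `newline=""`.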