How can I get this JSON code using BeautifulSoup?

Ste*_*ano 4 python

JSON

 <script>
 var data2sales= 
 [{
   "key": "Owners",
   "bar": true,
   "values": [
     [1490400000000, 1591, "", "", ""],
     [1490486400000, 1924, "#2B6A94", "", ""],
     [1490572800000, 1982, "", "", ""],
     [1490659200000, 1606, "", "", ""]]
 }]
 </script>

My Python code for getting the JSON

 notices = str(soup.select('script')[30])
 split_words=notices.split('var data2sales= ')
 split_words=split_words[1]
 temp=split_words[44:689]
 temp = 'var data2sales= {' +temp + '}'
 print(temp)
 newDict = json.loads((temp))
 print(newDict)

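I'm new to BeautifulSoup in Python, and I'm trying to extract a dict from the page it parsed. As you can see in my code, I tried to rebuild the JSON in Python and store it in the newDict variable, but it doesn't work. Can anyone show me how to extract that JSON? Thank you.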

Sam*_*ats 6

Assuming the script above is in the string text, you can do the following:

import json
from bs4 import BeautifulSoup

soup = BeautifulSoup(text, 'html.parser')
script_text = soup.find('script').get_text()
relevant = script_text[script_text.index('=')+1:]  # drop the first '=' and everything before it
data = json.loads(relevant)  # a list containing one dict
print(json.dumps(data, indent=4))

Output:

[
    {
        "key": "Owners",
        "bar": true,
        "values": [
            [
                1490400000000,
                1591,
                "",
                "",
                ""
            ],
            [
                1490486400000,
                1924,
                "#2B6A94",
                "",
                ""
            ],
            [
                1490572800000,
                1982,
                "",
                "",
                ""
            ],
            [
                1490659200000,
                1606,
                "",
                "",
                ""
            ]
        ]
    }
]
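
If the page has many script tags (the question indexes the 31st one), or the assignment ends with a semicolon, slicing at the first '=' can fail. Below is a minimal sketch of a more tolerant variant. The helper name parse_js_assignment and the sample string are hypothetical, not part of the original page, and it assumes the assigned value is valid JSON (double-quoted keys, no JavaScript expressions). It uses only the standard library and works on the script text itself.

```python
import json
import re


def parse_js_assignment(source, var_name):
    """Return the JSON value assigned to `var_name` in `source`, or None if absent."""
    # Match "var <name> =" plus trailing whitespace, so decoding starts at the value.
    pattern = re.compile(r"\bvar\s+" + re.escape(var_name) + r"\s*=\s*")
    match = pattern.search(source)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (a trailing ";" or more JavaScript).
    value, _end = json.JSONDecoder().raw_decode(source, match.end())
    return value


# Hypothetical script body modelled on the question, with a trailing semicolon.
sample = """
var other = 1;
var data2sales=
[{
  "key": "Owners",
  "bar": true,
  "values": [[1490400000000, 1591, "", "", ""]]
}];
"""

data = parse_js_assignment(sample, "data2sales")
print(data[0]["key"])
print(data[0]["values"][0][1])
```

With BeautifulSoup, you could loop over soup.find_all('script'), pass each tag's .string (skipping tags where it is None) to this helper, and stop at the first result that is not None. That avoids depending on the script's position in the page.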