1/ I'm trying to extract part of a script using Beautiful Soup, but it prints None. What's wrong?
import urllib2  # Python 2 standard library
from bs4 import BeautifulSoup  # assumed; the original import is not shown

URL = "http://www.reuters.com/video/2014/08/30/woman-who-drank-restaurants-tainted-tea?videoId=341712453"
oururl = urllib2.urlopen(URL).read()
soup = BeautifulSoup(oururl)
for script in soup("script"):
    script.extract()
list_of_scripts = soup.findAll("script")
print list_of_scripts
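One possible reading of the snippet above: `extract()` detaches each tag from the tree, so by the time the script tags are searched for again there are none left and the result is empty. The following is only a minimal sketch of that behavior. It assumes the `bs4` package and uses a made-up inline HTML string in place of the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page (not the real Reuters HTML).
html = "<html><head><script>var a = 1;</script></head><body><p>hi</p></body></html>"

soup = BeautifulSoup(html, "html.parser")

# Collect the script tags BEFORE removing anything from the tree.
scripts_before = soup.find_all("script")

# extract() detaches each tag from the tree.
for script in soup("script"):
    script.extract()

# Searching again after the loop finds nothing.
scripts_after = soup.find_all("script")

print(len(scripts_before))
print(len(scripts_after))
```

If this reading is right, collecting the tags first and dropping the `extract()` loop would be enough to keep them available.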
2/ The goal is to extract the value of the "transcript" attribute:
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "VideoObject",
  "video": {
    "@type": "VideoObject",
    "headline": "Woman who drank restaurant's tainted tea hopes for industry...",
    "caption": "Woman who drank restaurant's tainted tea hopes for industry...",
    "transcript": "Jan Harding is speaking out for the first time about the ordeal that changed her life. SOUNDBITE: JAN HARDING, DRANK TAINTED TEA, SAYING: \"Immediately my …
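Since the target sits inside a JSON-LD block, one approach that might work is to locate the `<script type="application/ld+json">` tag and parse its contents with the standard `json` module. This is a hedged sketch, not a tested solution for the real page: the HTML below is a hypothetical stand-in with placeholder values, and it assumes the `bs4` package and that the page's JSON is well-formed with the same `video` / `transcript` nesting shown above.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical stand-in for the page; the values are placeholders,
# not the real headline or transcript.
html = """
<html><head>
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "VideoObject",
  "video": {
    "@type": "VideoObject",
    "headline": "placeholder headline",
    "transcript": "placeholder transcript text"
  }
}
</script>
</head><body></body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the JSON-LD script tag by its type attribute.
tag = soup.find("script", attrs={"type": "application/ld+json"})

# The tag's only child is the JSON text, so .string returns it.
data = json.loads(tag.string)

# Assumes the nesting from the question: top level -> "video" -> "transcript".
transcript = data["video"]["transcript"]
print(transcript)
```

On the live page, `soup.find` may return `None` if the tag is missing, and `json.loads` may raise `ValueError` if the embedded JSON is malformed, so both cases would need handling.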