jos*_*eph 7 html python select beautifulsoup
<select>
<option value="0">2002/12</option>
<option value="1">2003/12</option>
<option value="2">2004/12</option>
<option value="3">2005/12</option>
<option value="4">2006/12</option>
<option value="5" selected>2007/12</option>
</select>
With this markup, I need the value '0', not the text '2002/12'.
I have tried many BS4 options: .stripped_strings, .strip(), .contents, get(), and so on.
How can I get the value instead of the text?
Mar*_*ers 20
You want the value attribute; access tag attributes with mapping (dictionary-style) syntax:
option['value']
Demo:
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''\
... <select>
... <option value="0">2002/12</option>
... <option value="1">2003/12</option>
... <option value="2">2004/12</option>
... <option value="3">2005/12</option>
... <option value="4">2006/12</option>
... <option value="5" selected>2007/12</option>
... </select>
... ''', 'html.parser')
>>> for option in soup.find_all('option'):
...     print('value: {}, text: {}'.format(option['value'], option.text))
...
value: 0, text: 2002/12
value: 1, text: 2003/12
value: 2, text: 2004/12
value: 3, text: 2005/12
value: 4, text: 2006/12
value: 5, text: 2007/12
Like this:
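As a follow-up to the demo, here is a minimal sketch of what happens when an option has no value attribute at all. The third option in this markup is made up for illustration and is not part of the question's HTML. Mapping access raises KeyError on a missing attribute, while the tag's get() method returns None instead:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the last <option> deliberately lacks a value attribute.
HTML = """
<select>
<option value="0">2002/12</option>
<option value="1">2003/12</option>
<option>no value here</option>
</select>
"""

soup = BeautifulSoup(HTML, "html.parser")
options = soup.find_all("option")

# Tag.get() returns None when the attribute is absent,
# whereas option["value"] would raise KeyError for the third option.
values = [option.get("value") for option in options]
print(values)

# has_attr() lets you skip tags that lack the attribute entirely.
present = [option["value"] for option in options if option.has_attr("value")]
print(present)
```

Note that attribute values come back as strings ('0', not 0), so convert with int() if you need numbers.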
>>> from bs4 import BeautifulSoup
>>> doc = """
... <select>
... <option value="0">2002/12</option>
... <option value="1">2003/12</option>
... <option value="2">2004/12</option>
... <option value="3">2005/12</option>
... <option value="4">2006/12</option>
... <option value="5" selected>2007/12</option>
... </select>
... """
>>> soup = BeautifulSoup(doc, 'html.parser')
>>> options = soup.find_all('option')
>>> for option in options:
...     print(option['value'])
...
0
1
2
3
4
5
>>>
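If what you actually want is the value of the currently selected option, the same attribute access applies. This is only a sketch under the assumption that the markup looks like the question's, with a bare selected attribute on one option; the variable names are illustrative:

```python
from bs4 import BeautifulSoup

HTML = """
<select>
<option value="4">2006/12</option>
<option value="5" selected>2007/12</option>
</select>
"""

soup = BeautifulSoup(HTML, "html.parser")

# A bare `selected` attribute is still present on the tag, so has_attr() finds it.
selected = [o for o in soup.find_all("option") if o.has_attr("selected")]

# Fall back to None when no option is marked as selected.
selected_value = selected[0]["value"] if selected else None
print(selected_value)
```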