I know how to find all the links, but I want to get the text that immediately follows each link.
For example, in the given HTML:
<p><a href="/cgi-bin/bdquery/?&Db=d106&querybd=@FIELD(FLD004+@4((@1(Rep+Armey++Richard+K.))+00028))">Rep Armey, Richard K.</a> [TX-26]
- 11/9/1999
<br/><a href="/cgi-bin/bdquery/?&Db=d106&querybd=@FIELD(FLD004+@4((@1(Rep+Davis++Thomas+M.))+00274))">Rep Davis, Thomas M.</a> [VA-11]
- 11/9/1999
<br/><a href="/cgi-bin/bdquery/?&Db=d106&querybd=@FIELD(FLD004+@4((@1(Rep+DeLay++Tom))+00282))">Rep DeLay, Tom</a> [TX-22]
- 11/9/1999
... (this repeats many times)
I want to extract the [CA-28] - 11/9/1999 text associated with <a href=... >Rep Dreier, David</a>
and do this for every link in the list.
There may be a prettier way, but I usually chain .next:
>>> soup.find_all("a")
[<a href="/cgi-bin/bdquery/?&Db=d106&querybd=@FIELD(FLD004+@4((@1(Rep+Armey++Richard+K.))+00028))">Rep Armey, Richard K.</a>, <a href="/cgi-bin/bdquery/?&Db=d106&querybd=@FIELD(FLD004+@4((@1(Rep+Davis++Thomas+M.))+00274))">Rep Davis, Thomas M.</a>, <a href="/cgi-bin/bdquery/?&Db=d106&querybd=@FIELD(FLD004+@4((@1(Rep+DeLay++Tom))+00282))">Rep DeLay, Tom</a>]
>>> [a.next for a in soup.find_all("a")]
[u'Rep Armey, Richard K.', u'Rep Davis, Thomas M.', u'Rep DeLay, Tom']
>>> [a.next.next for a in soup.find_all("a")]
[u' [TX-26]\n - 11/9/1999\n', u' [VA-11]\n - 11/9/1999\n', u' [TX-22]\n - 11/9/1999']
>>> {a.next: a.next.next for a in soup.find_all("a")}
{u'Rep Davis, Thomas M.': u' [VA-11]\n - 11/9/1999\n', u'Rep DeLay, Tom': u' [TX-22]\n - 11/9/1999', u'Rep Armey, Richard K.': u' [TX-26]\n - 11/9/1999\n'}
and so on.
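Chaining .next walks the parse tree element by element, so it only lands on the trailing text while each link contains nothing but a single string. A minimal sketch of an alternative, assuming Beautiful Soup 4 and the built-in html.parser, is to read each link's next_sibling and collapse the whitespace. The shortened href values and the text_after helper below are placeholders made up for illustration, not part of the original page.

```python
from bs4 import BeautifulSoup, NavigableString

# Trimmed-down copy of the markup from the question; the href values
# are shortened placeholders, not the real query URLs.
html = """<p><a href="/rep/armey">Rep Armey, Richard K.</a> [TX-26]
 - 11/9/1999
<br/><a href="/rep/davis">Rep Davis, Thomas M.</a> [VA-11]
 - 11/9/1999
<br/><a href="/rep/delay">Rep DeLay, Tom</a> [TX-22]
 - 11/9/1999
</p>"""

soup = BeautifulSoup(html, "html.parser")


def text_after(link):
    """Return the text node directly after a link, with whitespace collapsed.

    Hypothetical helper: returns an empty string when the next sibling
    is a tag (or missing) rather than a text node.
    """
    sibling = link.next_sibling
    if isinstance(sibling, NavigableString):
        # split() with no arguments also swallows the embedded newlines
        return " ".join(sibling.split())
    return ""


# Map each link's visible text to the text that follows it
trailing = {a.get_text(): text_after(a) for a in soup.find_all("a")}

for name, info in trailing.items():
    print(name, "->", info)
```

Unlike a.next.next, next_sibling stays at the same level of the tree, so this should keep working if a link's text is ever wrapped in another tag such as <b>.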