Posts by Cha*_*yer

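Python http.client.IncompleteRead(0 bytes read) error

I have seen this error on the forum and read the replies, but I still don't understand what it is or how to fix it. I am scraping data from 16k links on the internet. My script scrapes similar information from each link and writes it to a .csv file, and some of the data is written before this error occurs.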

Traceback (most recent call last):
 File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 541, in _get_chunk_left
   chunk_left = self._read_next_chunk_size()
 File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 508, in _read_next_chunk_size
   return int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 558, in _readall_chunked
   chunk_left = self._get_chunk_left()
 File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 543, in _get_chunk_left
   raise IncompleteRead(b'')
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception …
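One common way to keep a long scraping run alive is to catch the exception and retry the request. The sketch below assumes that http.client.IncompleteRead is what reaches the calling code (the last exception in the chain above is truncated, so that is a guess) and that the pages are fetched with urllib.request.urlopen. The names fetch, retries, delay, and opener are hypothetical and do not come from the original script.

```python
import http.client
import time
import urllib.request


def fetch(url, retries=3, delay=1.0, opener=urllib.request.urlopen):
    """Return the response body for url, retrying on IncompleteRead.

    Returns None if every attempt fails, so the caller can skip the link.
    The opener parameter is a hypothetical hook that makes the function
    testable without network access.
    """
    for attempt in range(1, retries + 1):
        try:
            with opener(url) as response:
                return response.read()
        except http.client.IncompleteRead as exc:
            # The connection ended before the full body arrived.
            # exc.partial holds the bytes that were received (empty here).
            print(f"attempt {attempt} for {url} failed: {exc!r}")
            if attempt < retries:
                time.sleep(delay)
    return None
```

A caller could loop over the 16k links, skip any link for which fetch returns None, and log it for a later pass, so that one bad response does not stop the .csv from being written.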

urllib beautifulsoup web-scraping python-3.x

9
votes
1
answer
10k
views

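How to pull the <b>text</b> underneath an href tag using BeautifulSoup?

I am trying to find a way to pull some links and their associated text with Beautiful Soup. The HTML looks like this: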

<tr>
    <td align="left" bgcolor="#ffff99">
        <font size="2">
            <a href="link/I/Want.htm">
                <b>Text I Want</b>
            </a>
        </font>
     </td>

<tr>
    <td align="left" bgcolor="#ffff99">
        <font size="2">
            <a href="link/I/Want.htm2">
                <b>Text I Want2</b>
            </a>
        </font>
     </td>

I can pull the links without a problem:

soup.find_all('a', href=re.compile('link/I/Want'))

But I also want to pull the text and associate it with its link. Either have them back to back in one list, or put them in separate lists in the same order so that I can use the zip() function.
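As a minimal sketch of the pairing described above: each <a> tag returned by find_all already contains its <b> child, so the href and the text can be read from the same element. The function name link_text_pairs and the variable names are hypothetical, and the built-in "html.parser" backend is an assumption about the setup.

```python
import re

from bs4 import BeautifulSoup


def link_text_pairs(html):
    """Return a list of (href, text) tuples for the matching <a> tags."""
    soup = BeautifulSoup(html, "html.parser")
    anchors = soup.find_all("a", href=re.compile("link/I/Want"))
    # get_text(strip=True) drops the whitespace around the nested <b> text.
    return [(a["href"], a.get_text(strip=True)) for a in anchors]


sample = """
<tr>
    <td align="left" bgcolor="#ffff99">
        <font size="2">
            <a href="link/I/Want.htm">
                <b>Text I Want</b>
            </a>
        </font>
     </td>
<tr>
    <td align="left" bgcolor="#ffff99">
        <font size="2">
            <a href="link/I/Want.htm2">
                <b>Text I Want2</b>
            </a>
        </font>
     </td>
"""

pairs = link_text_pairs(sample)
# Two parallel lists, if the zip() style is preferred.
links = [href for href, _ in pairs]
texts = [text for _, text in pairs]
```

Building the pairs from one pass over the tags keeps each link and its text together, so no separate zip() step is needed unless two lists are wanted.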

python beautifulsoup web-scraping python-3.x

3
votes
1
answer
190
views

Tag statistics

beautifulsoup ×2

python-3.x ×2

web-scraping ×2

python ×1

urllib ×1