标签: dryscrape

pip install dryscrape失败,显示"错误:[Errno 2]没有这样的文件或目录:'src/webkit_server'"?

我需要为python安装dryscrape但我得到错误,有什么问题?

C:\Users\parvij\Anaconda3\Scripts>pip install dryscrape
Run Code Online (Sandbox Code Playgroud)

我懂了:

Collecting dryscrape
Collecting webkit-server>=1.0 (from dryscrape)
  Using cached webkit-server-1.0.tar.gz
Collecting xvfbwrapper (from dryscrape)
Requirement already satisfied (use --upgrade to upgrade): lxml in c:\users\parvij\anaconda3\lib\site-packages (from dryscrape)
Building wheels for collected packages: webkit-server
  Running setup.py bdist_wheel for webkit-server ... error
  Complete output from command c:\users\parvij\anaconda3\python.exe -u -c"import setuptools,tokenize;__file__='C:\\Users\\parvij\\AppData\\Local\\Temp\\pip-build-o7nlv0dz\\webkit-server\\setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" bdist_wheel -d C:\Users\parvij\AppData\Local\Temp\tmp71w59qv6pip-wheel- --python-tag cp35:
  running bdist_wheel
  running build
  'make' is not recognized as an internal or external command,
  operable program or batch file. …
Run Code Online (Sandbox Code Playgroud)

python install package python-3.x dryscrape

13
推荐指数
1
解决办法
9171
查看次数

用Dryscrape和BeautifulSoup刮网

我正在尝试从Yahoo抓取一些数据。我写了一个可行的脚本-在某些时候。有时,当我运行脚本时,我能够下载整个页面-有时,页面仅被部分加载-缺少数据部分。

更令人困惑的是,当我在浏览器中导航到该页面时,将显示整个页面。

这是我的代码的要点:

import dryscrape
from bs4 import BeautifulSoup

url =  'http://finance.yahoo.com/quote/SPY/options?p=SPY&straddle=false'

sess = dryscrape.Session()

sess.set_header('user-agent', 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0')

sess.set_attribute('auto_load_images', False)          
sess.set_timeout(360)

sess.visit(url)

soup = BeautifulSoup(sess.body(), 'lxml')

# Related to memory leak issue in webkit
sess.reset()

# Barfs (sometimes!) at the line below
sel_list = soup.find('select', class_='Fz(s)')

if sel_list is None or len(sel_list) == 0:
    print('element not found on page!')
Run Code Online (Sandbox Code Playgroud)

我已附上以下页面的图像。这是通过Web浏览器在Internet上查看时的网页:

页面显示数据

现在,这是我通过类似于上面显示的脚本下拉的页面-它没有数据!:

抓取下载的页面-没有数据!

任何人都可以弄清楚为什么在我的脚本中获取数据时有时会丢失该元素的原因吗?同样(更多?)重要的是,我该如何解决?

python lxml beautifulsoup web-scraping dryscrape

2
推荐指数
1
解决办法
2556
查看次数