Flo*_*eau 5 python selenium python-2.7 docker
For more than two hours I have been trying to set up Selenium in Python with Chrome inside an Alpine container. I don't understand why I get this error message:
    browser = webdriver.Chrome()
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 68, in __init__
    self.service.start()
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/common/service.py", line 83, in start
    os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
Can someone help me? Thanks a lot.
PS: here is the relevant part of my Dockerfile:
RUN wget "https://chromedriver.storage.googleapis.com/2.36/chromedriver_linux64.zip" &&\
busybox unzip chromedriver_linux64.zip &&\
chmod a+x chromedriver &&\
mv chromedriver /usr/bin/
Here is my method:
def __init__(self, url, parser="lxml"):
    self.url = url
    self.parser = parser
    browser = webdriver.Chrome()
    browser.get(self.url)
    ...
PS: the full Dockerfile:
FROM alpine:3.7
RUN apk add --update bash &&\
apk update &&\
apk upgrade
RUN apk add --no-cache python-dev ;\
apk add --no-cache python
# download the Python scraper libraries
RUN apk add --no-cache py-pip &&\
apk add --no-cache linux-headers &&\
apk add --no-cache texinfo &&\
apk add --no-cache gcc &&\
apk add --no-cache g++ &&\
apk add --no-cache gfortran &&\
apk add --no-cache libxml2-dev &&\
apk add --no-cache xmlsec-dev &&\
apk add --no-cache py-requests &&\
apk add --no-cache chromium &&\
apk add --no-cache chromium-chromedriver
# install the Python scraper libraries
RUN pip install beautifulsoup4 &&\
pip install requests &&\
pip install lxml &&\
pip install html5lib &&\
pip install urllib3 &&\
pip install -U selenium
# download the driver for Selenium
RUN wget "https://chromedriver.storage.googleapis.com/2.36/chromedriver_linux64.zip" &&\
busybox unzip chromedriver_linux64.zip &&\
chmod a+x chromedriver &&\
mv chromedriver /usr/bin/
# prepare the shell
CMD ["bash"]
WORKDIR "/root"
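One way to narrow this error down is to check two things separately: whether `chromedriver` is found on PATH, and whether the file that is found can be executed at all. Selenium reports the same "needs to be in PATH" message when starting the binary fails with a file-not-found error, so a driver that exists but cannot be launched looks identical to a missing one. A possible cause here, stated as an assumption and not confirmed in this thread, is that the downloaded `chromedriver_linux64.zip` binary is built for glibc while Alpine uses musl, and that it replaces the `/usr/bin/chromedriver` installed by the `chromium-chromedriver` package. The sketch below assumes Python 3 (`shutil.which` is not available in Python 2.7); `check_driver` is a hypothetical helper name, not part of Selenium.

```python
import shutil
import subprocess


def check_driver(name="chromedriver", path=None):
    """Report whether `name` is on PATH and whether it can be executed.

    `path` is an optional PATH-style string; None means the process PATH.
    Returns a dict with the resolved location, a boolean, and a detail string.
    """
    location = shutil.which(name, path=path)
    if location is None:
        return {"found": None, "runs": False, "detail": "not on PATH"}
    try:
        # `chromedriver --version` prints the driver version and exits.
        result = subprocess.run(
            [location, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # An OSError here means the file exists but could not be launched,
        # for example because its loader or interpreter is missing.
        return {"found": location, "runs": False, "detail": str(exc)}
    output = result.stdout.decode(errors="replace").strip()
    return {"found": location, "runs": result.returncode == 0, "detail": output}
```

Running `print(check_driver())` inside the container shows which of the two cases applies. If `found` is set but `runs` is false, removing the manual download step and relying on the `chromium-chromedriver` package alone may be worth trying.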
Instead of Chrome I used Firefox. Here is my complete Dockerfile; it works for me. I hope it helps you. Have fun.
FROM alpine:3.7
RUN apk add --no-cache bash &&\
apk update &&\
apk upgrade
ENV PATH /usr/local/bin:$PATH
RUN apk add --no-cache make &&\
apk add --no-cache python3-dev &&\
apk add --no-cache python3 &&\
apk add --no-cache firefox-esr &&\
apk add --no-cache wget &&\
apk add --no-cache git &&\
apk add --no-cache icu-libs &&\
apk add --no-cache xvfb &&\
apk add --no-cache linux-headers &&\
apk add --no-cache texinfo &&\
apk add --no-cache gcc &&\
apk add --no-cache g++ &&\
apk add --no-cache gfortran &&\
apk add --no-cache libxml2-dev &&\
apk add --no-cache xmlsec-dev &&\
apk add --no-cache py-requests &&\
apk add --no-cache qt-dev &&\
apk add --no-cache openjdk7-jre &&\
apk add --no-cache dbus-x11 &&\
apk add --no-cache ttf-freefont &&\
rm -rf /var/cache/apk/*
#python
RUN python3 -m ensurepip &&\
rm -r /usr/lib/python*/ensurepip &&\
pip3 install --upgrade pip setuptools &&\
if [ ! -e /usr/bin/pip ]; then ln -s pip3 /usr/bin/pip ; fi &&\
if [[ ! -e /usr/bin/python ]]; then ln -sf /usr/bin/python3 /usr/bin/python; fi &&\
rm -r /root/.cache
#firefox
RUN rm -rf /tmp/* /var/cache/apk/* &&\
wget "https://github.com/mozilla/geckodriver/releases/download/v0.19.1/geckodriver-v0.19.1-linux64.tar.gz" &&\
tar -xvf geckodriver-v0.19.1-linux64.tar.gz &&\
rm -rf geckodriver-v0.19.1-linux64.tar.gz &&\
chmod a+x geckodriver &&\
mv geckodriver /usr/local/bin/
#selenium
RUN pip install "selenium<3" &&\
pip install virtualenv &&\
pip install pyvirtualdisplay
#X server
RUN git clone "https://github.com/niklasb/webkit-server.git" &&\
cd webkit-server &&\
python setup.py install
ADD start_script.sh /tmp/start_script.sh
RUN chmod +x /tmp/start_script.sh
#mysql
RUN apk add --no-cache mariadb-dev
RUN pip install mysqlclient
# prepare the shell
RUN mkdir /var/shared
WORKDIR "/var/shared"
CMD ["/tmp/start_script.sh"]
And the start script:
#!/bin/sh
Xvfb :00 &
export DISPLAY=:00
python3 scraper/main.py
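The start script launches `scraper/main.py`, which is not shown in this answer. As a rough illustration only, an entry point of that kind might look like the sketch below. It assumes the `DISPLAY` variable exported by the start script is already set and that Selenium can start Firefox in the container; `fetch_page_source` and `driver_factory` are hypothetical names introduced here, not taken from the original code.

```python
def fetch_page_source(url, driver_factory=None):
    """Load `url` in a browser and return the rendered HTML.

    `driver_factory` is any callable returning a Selenium-like driver.
    It defaults to selenium.webdriver.Firefox.
    """
    if driver_factory is None:
        # Imported lazily so this module can be imported without Selenium.
        from selenium import webdriver
        driver_factory = webdriver.Firefox
    browser = driver_factory()
    try:
        browser.get(url)
        return browser.page_source
    finally:
        # Always close the browser so Firefox processes do not pile up
        # inside the container.
        browser.quit()


# Example call (placeholder URL):
#     html = fetch_page_source("https://example.org")
```

Passing the driver constructor as a parameter keeps the function easy to exercise with a stand-in object, and the `finally` block makes sure the browser is closed even when loading the page fails.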