PyQt4 to PyQt5 -> mainFrame() is deprecated, need a fix to load web pages

Les*_*aul 3 python pyqt pyqt4 python-3.x pyqt5

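I'm following Sentdex's PyQt4 tutorial on YouTube, but trying to do it with PyQt5 instead. It's a simple web scraping application. Following the tutorial, I got this far:

[screenshot from the original post, not preserved]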

Now I'm trying to write the same application with PyQt5, and this is what I have:

import os
import sys
from PyQt5.QtWidgets import QApplication
from PyQt5.QtCore import QUrl, QEventLoop
from PyQt5.QtWebEngineWidgets import QWebEnginePage
from bs4 import BeautifulSoup
import requests


class Client(QWebEnginePage):
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebEnginePage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.load(QUrl(url))
        self.app.exec_()

    def _loadFinished(self):
        self.app.quit()


url = 'https://pythonprogramming.net/parsememcparseface/'
client_response = Client(url)

#I think the issue is here at LINE 26
source = client_response.mainFrame().toHtml()

soup = BeautifulSoup(source, "html.parser")
js_test = soup.find('p', class_='jstest')
print(js_test.text)

When I run it, I get this message:

source = client_response.mainFrame().toHtml()
AttributeError: 'Client' object has no attribute 'mainFrame'

I've tried several different solutions, but none of them worked. Any help would be appreciated.

Edit

Logging QUrl(url) at line 15 returns this value:

PyQt5.QtCore.QUrl('https://pythonprogramming.net/parsememcparseface/')

When I try source = client_response.load(QUrl(url)) on line 26, I end up with this message:

File "test3.py", line 28, in <module>
    soup = BeautifulSoup(source, "html.parser")
File "/Users/MYNAME/.venv/qtproject/lib/python3.6/site-packages/bs4/__init__.py", line 192, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'NoneType' has no len()

When I try source = client_response.url(), I get:

soup = BeautifulSoup(source, "html.parser")
      File "/Users/MYNAME/.venv/qtproject/lib/python3.6/site-packages/bs4/__init__.py", line 192, in __init__
        elif len(markup) <= 256 and (
    TypeError: object of type 'QUrl' has no len()

小智 14

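You have to call QWebEnginePage::toHtml() inside the class definition. QWebEnginePage::toHtml() is asynchronous: it takes a function or lambda as its argument, and that callback must in turn accept one parameter of type 'str', which receives the page's HTML. Here is some sample code.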

import bs4 as bs
import sys
import urllib.request
from PyQt5.QtWebEngineWidgets import QWebEnginePage
from PyQt5.QtWidgets import QApplication
from PyQt5.QtCore import QUrl

class Page(QWebEnginePage):
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebEnginePage.__init__(self)
        self.html = ''
        self.loadFinished.connect(self._on_load_finished)
        self.load(QUrl(url))
        self.app.exec_()

    def _on_load_finished(self):
        # toHtml() returns None; the HTML is delivered to the callback instead
        self.toHtml(self.Callable)
        print('Load finished')

    def Callable(self, html_str):
        self.html = html_str
        self.app.quit()


def main():
    page = Page('https://pythonprogramming.net/parsememcparseface/')
    soup = bs.BeautifulSoup(page.html, 'html.parser')
    js_test = soup.find('p', class_='jstest')
    print(js_test.text)

if __name__ == '__main__': main()
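The callback contract of toHtml() can be seen in isolation, without Qt installed. This is only a sketch: `FakePage` and its `deliver()` method are hypothetical stand-ins that mimic the shape of the real call (it returns None and hands the HTML to the callback later), and the HTML string is made up.

```python
class FakePage:
    """Hypothetical stand-in that mimics the shape of QWebEnginePage.toHtml()."""

    def __init__(self, html):
        self._html = html
        self._pending = []

    def toHtml(self, callback):
        # Like the real method, this returns None and only remembers the callback.
        self._pending.append(callback)

    def deliver(self):
        # Stands in for the Qt event loop invoking the queued callbacks.
        for callback in self._pending:
            callback(self._html)
        self._pending.clear()


received = []
page = FakePage("<p>hello</p>")
returned = page.toHtml(received.append)

print(returned)        # the return value never carries the HTML
print(len(received))   # the callback has not run yet
page.deliver()
print(received)        # the HTML has now been handed to the callback
```

This is why assigning the result of toHtml() to a variable does not give you the page source: the string only exists once the callback has been invoked.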

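  • This works fine if I only need to fetch one page. But if I create a loop where each iteration downloads a page, Python crashes. Any idea what to do? There is no exception or error message - Python itself crashes and OSX offers to submit a bug report (4 upvotes)
  • @KaizerSozay I have the same problem, did you ever find a solution? (2 upvotes)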
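Regarding the crash in a loop: the Page class above creates a new QApplication every time it is instantiated, and Qt expects only one per process, so that is a likely cause (an assumption, since no traceback was posted). A minimal sketch of one way around it is to create a single application and a single page and load the URLs one after another. `SequentialLoader`, `fetch_all` and the `make_url` parameter are illustrative names, not part of PyQt5.

```python
import sys


class SequentialLoader:
    """Load URLs one at a time on a single page and collect their HTML.

    `page` needs load(url), toHtml(callback) and a loadFinished signal,
    which is the interface QWebEnginePage offers.
    """

    def __init__(self, page, urls, on_done, make_url=lambda u: u):
        self.page = page
        self.pending = list(urls)
        self.on_done = on_done        # called once the queue is empty
        self.make_url = make_url      # e.g. QUrl when used with Qt
        self.current = None
        self.results = {}             # url string -> html string
        page.loadFinished.connect(self._on_load_finished)

    def start(self):
        self._load_next()

    def _load_next(self):
        if not self.pending:
            self.on_done()
            return
        self.current = self.pending.pop(0)
        self.page.load(self.make_url(self.current))

    def _on_load_finished(self, ok):
        # toHtml() is asynchronous: the HTML arrives in the callback.
        self.page.toHtml(self._store_html)

    def _store_html(self, html):
        self.results[self.current] = html
        self._load_next()


def fetch_all(urls):
    """Fetch all URLs under one QApplication (needs PyQt5 with QtWebEngine)."""
    if not urls:
        return {}
    # Imported lazily so the class above can be used without Qt installed.
    from PyQt5.QtCore import QUrl
    from PyQt5.QtWebEngineWidgets import QWebEnginePage
    from PyQt5.QtWidgets import QApplication

    app = QApplication.instance() or QApplication(sys.argv)
    page = QWebEnginePage()
    loader = SequentialLoader(page, urls, app.quit, make_url=QUrl)
    loader.start()
    app.exec_()   # returns after the last callback calls app.quit()
    return loader.results
```

Calling `fetch_all(['https://pythonprogramming.net/parsememcparseface/'])` should return a dict mapping each URL to its rendered HTML, which can then be passed to BeautifulSoup as in the answer.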