AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'encode'

Question

I'm trying to build a desktop notifier, and for that I'm scraping news from a website. When I run the program, I get the following error.

news[child.tag] = child.encode('utf8')
AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'encode'

How do I fix this? I'm completely new to this. I tried looking for solutions, but none of them worked for me.

Here is my code:

import requests
import xml.etree.ElementTree as ET


# url of news rss feed
RSS_FEED_URL = "http://www.hindustantimes.com/rss/topnews/rssfeed.xml"


def loadRSS():
    '''
    utility function to load RSS feed
    '''
    # create HTTP request response object
    resp = requests.get(RSS_FEED_URL)
    # return response content
    return resp.content


def parseXML(rss):
    '''
    utility function to parse XML format rss feed
    '''
    # create element tree root object
    root = ET.fromstring(rss)
    # create empty list for news items
    newsitems = []
    # iterate news items
    for item in root.findall('./channel/item'):
        news = {}
        # iterate child elements of item
        for child in item:
            # special checking for namespace object content:media
            if child.tag == '{http://search.yahoo.com/mrss/}content':
                news['media'] = child.attrib['url']
            else:
                news[child.tag] = child.encode('utf8')
        newsitems.append(news)
    # return news items list
    return newsitems


def topStories():
    '''
    main function to generate and return news items
    '''
    # load rss feed
    rss = loadRSS()
    # parse XML
    newsitems = parseXML(rss)
    return newsitems

Answer 1

Kev*_*ase 2

You are trying to convert a str to bytes and then store those bytes in a dictionary. The problem is that the object you are doing this to is an xml.etree.ElementTree.Element, not a str.

You probably meant to get the text from within or around that element and then encode() that text. The docs suggest using the itertext() method:

''.join(child.itertext())

This evaluates to a str, which you can then encode().
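As a minimal sketch of the difference between an Element, its text attribute, and itertext(), the snippet below parses a made-up item (the XML string is a hypothetical sample, not content from the real feed):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample element, not taken from the real feed.
item = ET.fromstring("<item><title>Top <b>news</b> today</title></item>")
title = item.find("title")

# An Element is a tree node, not a string, so it has no encode() method.
has_encode = hasattr(title, "encode")

# .text only holds the text that comes before the first child element.
leading = title.text

# itertext() walks the element and its descendants and yields every text
# fragment in document order; joining them gives one str.
text = "".join(title.itertext())

# Only a str can be encoded.
encoded = text.encode("utf8")

print(has_encode)  # False
print(text)        # Top news today
```

For flat RSS fields such as title or link, itertext() and the text attribute usually give the same result; they differ only when the element contains nested markup.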

Note that the text and tail attributes might not contain text (emphasis added):

    "Their values are usually strings but may be any application-specific object."

If you want to use those attributes, you will have to handle None and non-string values:

# text is the content before the first child element; tail follows the closing tag
head = '' if child.text is None else str(child.text)
tail = '' if child.tail is None else str(child.tail)
# Do something with head and tail...

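Even that is not enough. If text or tail contains a bytes object in some unexpected (or plain wrong) encoding, the result will be wrong or the conversion will raise a Unicode error, because str() does not decode bytes for you.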
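One way to cover None, bytes, and other non-string values in a single place is a small helper. This is only a sketch: as_text is a hypothetical name, not part of ElementTree, and it assumes any bytes values are UTF-8.

```python
import xml.etree.ElementTree as ET


def as_text(value, encoding="utf-8"):
    """Return value as a str. Hypothetical helper, not an ElementTree API."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        # Assumption: bytes are UTF-8. Undecodable bytes are replaced with
        # U+FFFD instead of raising an exception.
        return value.decode(encoding, errors="replace")
    return str(value)


# Hypothetical sample: <b> has text "bold" and tail " after".
child = ET.fromstring("<p><b>bold</b> after</p>").find("b")
head = as_text(child.text)
tail = as_text(child.tail)

# A freshly created element has text set to None.
empty = as_text(ET.Element("empty").text)
```

Replacing undecodable bytes is a design choice made for this sketch; raising an error instead may be preferable when bad input should not pass silently.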

Strings vs. bytes

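I would suggest keeping the text as str and not encoding it at all. Encoding text to a bytes object is meant to be the last step before writing it to a binary file, a network socket, or some other piece of hardware.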
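A rough illustration of encoding only at the boundary, using an in-memory binary buffer as a stand-in for a file or socket (the headline string is a made-up sample):

```python
import io

# Hypothetical sample text containing non-ASCII curly quotes.
headline = "Ayushmann Khurrana: \u201ca fun ride\u201d"

# Do all processing on str...
shout = headline.upper()

# ...and encode only when the text leaves the program, here into a
# binary buffer that stands in for a binary file or network socket.
buffer = io.BytesIO()
buffer.write(shout.encode("utf8"))
raw = buffer.getvalue()

# Decoding with the same encoding recovers the original str.
roundtrip = raw.decode("utf8")
```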

For more on the difference between bytes and characters, see Ned Batchelder's "Pragmatic Unicode, or, How Do I Stop the Pain?" (a 36-minute video from PyCon US 2012). He covers both Python 2 and 3.

Example output

Using the child.itertext() method, and not encoding the strings, I got a reasonable-looking list of dictionaries from topStories():

[
  ...,
  {'description': 'Ayushmann Khurrana says his five-year Bollywood journey has '
                  'been \xe2\x80\x9ca fun ride\xe2\x80\x9d; adds success is a lousy teacher while '
                  'failure is \xe2\x80\x9cyour friend, philosopher and guide\xe2\x80\x9d.',
    'guid': 'http://www.hindustantimes.com/bollywood/i-am-a-hardcore-realist-and-that-s-why-i-feel-my-journey-has-been-a-joyride-ayushmann-khurrana/story-KQDR7gMuvhD9AeQTA7tbmI.html',
    'link': 'http://www.hindustantimes.com/bollywood/i-am-a-hardcore-realist-and-that-s-why-i-feel-my-journey-has-been-a-joyride-ayushmann-khurrana/story-KQDR7gMuvhD9AeQTA7tbmI.html',
    'media': 'http://www.hindustantimes.com/rf/image_size_630x354/HT/p2/2017/06/26/Pictures/actor-ayushman-khurana_24f064ae-5a5d-11e7-9d38-39c470df081e.JPG',
    'pubDate': 'Mon, 26 Jun 2017 10:50:26 GMT ',
    'title': "I am a hardcore realist, and that's why I&thinsp;feel my journey "
             'has been a joyride: Ayushmann...'},
]
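For reference, here is a sketch of the question's parseXML() with the one failing line swapped for the itertext() approach. To keep it runnable offline, it parses SAMPLE_RSS, a hypothetical stand-in for the downloaded feed with example.com URLs, rather than the real Hindustan Times response.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for resp.content, so the sketch runs without a network.
SAMPLE_RSS = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Sample headline</title>
      <link>http://example.com/story.html</link>
      <media:content url="http://example.com/pic.jpg"/>
    </item>
  </channel>
</rss>"""


def parseXML(rss):
    """Parse an RSS feed (bytes or str) into a list of dicts of str values."""
    root = ET.fromstring(rss)
    newsitems = []
    for item in root.findall('./channel/item'):
        news = {}
        for child in item:
            # media:content carries its data in the url attribute
            if child.tag == '{http://search.yahoo.com/mrss/}content':
                news['media'] = child.attrib['url']
            else:
                # join the element's text fragments into a str; no encode()
                news[child.tag] = ''.join(child.itertext())
        newsitems.append(news)
    return newsitems


items = parseXML(SAMPLE_RSS)
print(items)
```

The values stay as str, so they can be passed straight to whatever displays the notification; encoding is only needed if they are later written to a binary destination.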
