Looping through a pickle-loaded list to find the user ID

AEA*_*AEA 3 python loops list pickle python-2.7

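I'm having trouble looping through a list loaded from a pickle file. The end goal of this code is to loop through each item and return each item's ID number.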

import pickle

# Open the file and load the pickled list
with open('TEMP_ITEMS.txt', 'rb') as openfile:
    items = pickle.load(openfile)

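My attempt to loop through this and find the ID number is based on some old XML-scraping techniques, but for some reason the logic doesn't carry over here.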

for item in enumerate(items):

    pattern0 = re.compile('ID: (.*?) <br>')
    idnumber = float(re.findall(pattern0, items[0])[0])
    print "ID Number: ",idnumber 

Sample contents of TEMP_ITEMS.txt:

(lp0
S'\n                <item>\n                    <title>Timmy</title>\n                    <link>caturl</link>\n                    <description><![CDATA[\n                                Timmy <br>\n                                ID: 3712 <br>\n                                Age: 10 <br>\n                                Weight: 7lbs <br>\n                                Time: 17:23 <br>\n                                Cat Name: Timmy <br>\n\n                    ]]></description>\n                    <guid isPermaLink="false">04e72b29-065d-4893-a4d2-f16ff30a283e</guid>\n                    <pubDate>Fri, 21 Jun 2013 01:09:05 GMT</pubDate>\n                </item>'
p1
aS'\n                <item>\n                    <title>George</title>\n                    <link>caturl</link>\n                    <description><![CDATA[\n                                George <br>\n                                ID: 4124 <br>\n                                Age: 14 <br>\n                                Weight: 8lbs <br>\n                                Time: 15:41 <br>\n                                Cat Name: George <br>\n\n                    ]]></description>\n                    <guid isPermaLink="false">212f9fbf-564b-470a-a64a-ef51036ff06a</guid>\n                    <pubDate>Fri, 21 Jun 2013 01:28:20 GMT</pubDate>\n                </item>'
p2
a.

Any help or advice on this would be much appreciated. Kind regards, AEA

The code I used following falsetru's suggestion, which returns an error:

import pickle
import re

with open('TEMP_RSS_ITEMS.txt', 'rb') as temp_rss_items_open4:
    items = pickle.load(temp_rss_items_open4)        
    print items
    for item in enumerate(items):
        pattern0 = re.compile('ID: (.*) <br>')
        for idnumber in re.findall(pattern0, item):
            print idnumber

The error it produces:

Traceback (most recent call last):
  File "C:/Sharing/test1.py", line 9, in <module>
    for idnumber in re.findall(pattern0, item):
  File "C:\Python27\lib\re.py", line 177, in findall
    return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
>>> 

fal*_*tru 6

Try the non-greedy version, .*?:

pattern0 = re.compile(r'ID: (.*?) <br>')

Or use \d+ if the ID consists only of digits:

pattern0 = re.compile(r'ID: (\d+)')
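To illustrate how these patterns differ, here is a small sketch. The `sample` string is made up for this example (it puts two `<br>` tags on one line, unlike the pickled data in the question, where each field sits on its own line), and the variable names are hypothetical.

```python
import re

# Hypothetical sample: two <br> tags on the same line.
sample = "ID: 3712 <br> Age: 10 <br>"

greedy = re.compile(r'ID: (.*) <br>')       # grabs as much as possible
non_greedy = re.compile(r'ID: (.*?) <br>')  # stops at the first " <br>"
digits_only = re.compile(r'ID: (\d+)')      # matches digits only

greedy_result = greedy.findall(sample)
non_greedy_result = non_greedy.findall(sample)
digits_result = digits_only.findall(sample)

print(greedy_result)      # ['3712 <br> Age: 10']
print(non_greedy_result)  # ['3712']
print(digits_result)      # ['3712']
```

In the question's sample data each field is on its own line, and `.` does not match a newline by default, so the greedy and non-greedy patterns would likely return the same IDs there. The non-greedy or `\d+` form is simply the safer choice if the layout ever changes.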

UPDATE

import pickle
import re

# Compile once, outside the loop; non-greedy, as suggested above
pattern0 = re.compile(r'ID: (.*?) <br>')
with open('TEMP_RSS_ITEMS.txt', 'rb') as f:
    items = pickle.load(f)        
    for item in items:
        for idnumber in pattern0.findall(item):
            print idnumber
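The TypeError in the question most likely arises because enumerate(items) yields (index, item) tuples, so re.findall receives a tuple instead of a string. Iterating over items directly avoids that. The following is a minimal sketch of the whole round trip, under some assumptions: it pickles a shortened, made-up list in memory instead of reading TEMP_RSS_ITEMS.txt, the helper `extract_ids` is a hypothetical name, and it converts IDs to int on the assumption that they are purely numeric.

```python
import pickle
import re

# Assumes IDs consist only of digits.
ID_PATTERN = re.compile(r'ID: (\d+)')

def extract_ids(items):
    """Return every ID found in a list of item strings, as ints."""
    ids = []
    for item in items:  # iterate over the strings themselves
        ids.extend(int(match) for match in ID_PATTERN.findall(item))
    return ids

# Shortened stand-ins for the pickled <item> strings in the question.
sample_items = [
    "<item>\n Timmy <br>\n ID: 3712 <br>\n Age: 10 <br>\n</item>",
    "<item>\n George <br>\n ID: 4124 <br>\n Age: 14 <br>\n</item>",
]

# Round-trip through pickle in memory rather than through a file.
items = pickle.loads(pickle.dumps(sample_items))

# enumerate() produces (index, item) tuples, not plain strings.
first_pair = next(enumerate(items))

id_numbers = extract_ids(items)
print(id_numbers)  # [3712, 4124]
```

If the index is needed as well, unpacking the tuple (for index, item in enumerate(items)) keeps item a string.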