JC1*_*JC1 6 python postgresql psycopg2
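I'm trying to insert items into a Postgres table concurrently through a ThreadedConnectionPool, but I keep getting psycopg2.pool.PoolError: trying to put unkeyed connection, and I don't know why. I also tried running it sequentially and still hit the same error.
Essentially, the code scrapes a website's product sitemap and inserts the scraped items into a database.
Code: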
import threading

import requests
from psycopg2 import IntegrityError
from psycopg2.pool import ThreadedConnectionPool


class items:
    def __init__(self):
        self.conn = ThreadedConnectionPool(10, 100, dbname='postgres', user='xxx', password='xxx', host='xxx')
        self.url = "some url"
        self.session = requests.Session()

    def scrape(self, pageNo):
        # some logic
        self.page(pageNo)

    # scrapes specified page from sitemap
    def page(self, page):
        resp = self.session.get(self.mens+"?page="+str(page)).json()
        products = resp['products']
        ts = []
        for item in products:
            # self.indivProduct(self.url + pageNo)
            t = threading.Thread(target=self.indivProduct, args=(self.url + pageNo,))
            ts.append(t)
            t.start()
        for item in ts:
            item.join()

    def indivProduct(self, url):
        conn = self.conn.getconn()
        cursor = conn.cursor()
        # Some logic with requests
        try:
            sql = 'insert into "Output" ' \
                  '("productID", "brand", "categoryID", "productName", "price", "sizeInfo", "SKU", "URL", "dateInserted", "dateUpdated")' \
                  'values (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)'
            cursor.execute(sql,
                           (.., .., ..,))
            conn.commit()
        except IntegrityError:
            conn.rollback()
            sql = 'insert into "Output" ' \
                  '("productID", "brand", "categoryID", "productName", "price", "sizeInfo", "SKU", "URL", "dateInserted", "dateUpdated")' \
                  'values (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s) on conflict ("productID") do update set "dateUpdated" = EXCLUDED."dateUpdated"'
            cursor.execute(sql,
                           (.., .., ..,))
            conn.commit()
        except Exception as e:
            print(e)
            print()
        finally:
            self.conn.putconn()
Main:
s = items()
s.scrape(3)
小智 2
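You're seeing this error because you're effectively passing None to putconn(): it is called without the connection you checked out. The source is here: https://github.com/psycopg/psycopg2/blob/master/lib/pool.py
You should change your finally block to: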
        finally:
            cursor.close()
            self.conn.putconn(conn)
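One way to make it hard to forget the argument is to wrap the checkout and return in a small context manager. This is only a sketch: pooled_connection is a hypothetical helper name, not part of psycopg2, and it assumes the pool exposes getconn() and putconn(conn) the way ThreadedConnectionPool does.

```python
from contextlib import contextmanager


@contextmanager
def pooled_connection(pool):
    """Check a connection out of `pool` and always hand the same one back.

    Hypothetical helper: `pool` is assumed to expose getconn() and
    putconn(conn), as psycopg2's ThreadedConnectionPool does.
    """
    conn = pool.getconn()
    try:
        yield conn
        conn.commit()       # commit only if the body finished without error
    except Exception:
        conn.rollback()     # undo the partial transaction, then re-raise
        raise
    finally:
        # Pass the connection explicitly so the pool can recognise it.
        pool.putconn(conn)


# Possible use inside indivProduct (sql and params as in the question):
#
#     with pooled_connection(self.conn) as conn:
#         with conn.cursor() as cursor:
#             cursor.execute(sql, params)
```

With this shape the connection goes back to the pool on every path, including when the insert raises, and there is no separate putconn() call to get wrong.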
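I also ran into this error after forcing a connection pool refresh: one line was still trying to call putconn(conn) on a connection that came from the old pool.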
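Both cases come down to the same thing: the pool is handed something it has no record of lending out. The toy model below illustrates that idea. ToyPool, its attributes, and try_put are hypothetical and heavily simplified; this is not psycopg2's actual implementation, only a sketch of the bookkeeping that, as far as I can tell from the linked pool.py, sits behind the error message.

```python
class PoolError(Exception):
    """Stand-in for psycopg2.pool.PoolError (no database needed here)."""


class ToyPool:
    """Hypothetical, simplified pool; NOT psycopg2's real code."""

    def __init__(self, size):
        self._free = [object() for _ in range(size)]  # fake connections
        self._used = {}  # id(conn) -> conn, for checked-out connections

    def getconn(self):
        conn = self._free.pop()
        self._used[id(conn)] = conn
        return conn

    def putconn(self, conn=None):
        # None, or a connection this pool never handed out, has no entry.
        if id(conn) not in self._used:
            raise PoolError("trying to put unkeyed connection")
        self._free.append(self._used.pop(id(conn)))


def try_put(pool, conn):
    """Return the error message, or None if the put succeeded."""
    try:
        pool.putconn(conn)
    except PoolError as exc:
        return str(exc)
    return None


old_pool = ToyPool(2)
conn = old_pool.getconn()
new_pool = ToyPool(2)  # simulates a forced pool refresh

missing_arg = try_put(old_pool, None)  # putconn() without the connection
wrong_pool = try_put(new_pool, conn)   # connection belongs to the old pool
right_pool = try_put(old_pool, conn)   # returned to the pool that issued it
```

In this model the first two calls fail with the same message and only the third succeeds, which matches the two situations described above: a missing argument, and a connection returned to a pool that did not issue it.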