Vol*_*il3 3 python python-3.x python-3.9
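I just upgraded from Python 3.7 to 3.9.14 and now I get a "variable is not defined" error. The same code works fine both locally and remotely with Python 3.9.2 installed, but it now fails locally on Python 3.9.14. Here is the code: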
def check(url):
    result = None
    product = Product(url, user_agents)
    if product.is_connected():
        result = product.parse()
    return result
if __name__ == '__main__':
    user_agents = []
    with open('user-agents.txt', encoding='utf8') as f:
        user_agents = f.readlines()
    if len(links) > 0:
        print('Starting with the Pool count = ', PRODUCT_POOL_COUNT)
        with Pool(PRODUCT_POOL_COUNT) as p:
            result = p.map(check, links)
        result = list(filter(None, result))  # Remove Empty
Here is the error message:
Traceback (most recent call last):
  File "/Users/Me/.pyenv/versions/3.9.14/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/Users/Me/.pyenv/versions/3.9.14/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/Users/Me/Data/Clients/App/Etsy/products/parse_product.py", line 12, in check
    product = Product(url, user_agents)
NameError: name 'user_agents' is not defined
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/Users/Me/Data/Clients/App/Etsy/products/parse_product.py", line 126, in <module>
    result = p.map(check, links)
  File "/Users/Me/.pyenv/versions/3.9.14/lib/python3.9/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Users/Me/.pyenv/versions/3.9.14/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
NameError: name 'user_agents' is not defined
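You must be on macOS. On that OS, Python changed the default method for starting child processes (the switch happened in Python 3.8), which also explains why it "works remotely on Python 3.9": the remote deployment must be running Linux or some Unix other than macOS.
Bear with me: on every Unix, the default way to create a child process used to be "fork". With "fork", the new process is an exact copy of its parent, including every global variable already defined, so the global user_agents exists and is visible inside the target function.
The new default on macOS is "spawn": the new process starts a fresh interpreter and re-imports the main script, re-executing every line except those guarded by if __name__ == "__main__":. In the child process that guard is false, because the script is no longer the __main__ module of the running Python program (the original process is); multiprocessing re-imports it under the name __mp_main__ instead.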
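The difference described above can be observed directly with a minimal sketch like the one below. It is not taken from the question's code: `read_flag`, `run_with`, and `flag` are hypothetical names, and the script requests each start method explicitly through `multiprocessing.get_context` rather than relying on the platform default. It needs to be saved and run as a script file so that the workers can re-import it.

```python
import multiprocessing as mp


def read_flag():
    # Runs inside the worker: report the module name the worker sees and
    # whether the global "flag" exists there.
    return __name__, globals().get("flag", "<not defined>")


def run_with(method):
    # Ask for a specific start method instead of the platform default.
    ctx = mp.get_context(method)
    with ctx.Pool(1) as pool:
        return pool.apply(read_flag)


if __name__ == "__main__":
    # Only the parent process executes this guarded block.
    flag = "set in the guarded block"
    print("default start method:", mp.get_start_method())
    for method in mp.get_all_start_methods():
        print(method, "->", run_with(method))
```

With "fork" the worker should report the value assigned in the guarded block, while with "spawn" it should report the name as missing, which is the same situation that produces the NameError in the question.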
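Leaving aside the motivation for this change (it is easy to look up), the fix is simple: just define the global variable outside the guarded block: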
(...)
user_agents = open("user-agents.txt", encoding="utf-8").readlines()

if __name__ == '__main__':
    if len(links) > 0:
        print('Starting with the Pool count = ', PRODUCT_POOL_COUNT)
        with Pool(PRODUCT_POOL_COUNT) as p:
            result = p.map(check, links)
        result = list(filter(None, result))  # Remove Empty
(Also, if you read the file in a single call, there is no need for the whole "with open" ceremony.)
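As an alternative that the answer does not propose, the list can be passed to the workers explicitly, so the target function does not depend on any global at all. The sketch below assumes a simplified `check` that takes `user_agents` as a second parameter; its body is a hypothetical stand-in for the `Product` class, and the sample data replaces user-agents.txt and the real links.

```python
from functools import partial
from multiprocessing import Pool


def check(url, user_agents):
    # Hypothetical stand-in for Product(url, user_agents).parse():
    # pick a user agent deterministically and pair it with the URL.
    if not user_agents:
        return None
    agent = user_agents[len(url) % len(user_agents)]
    return (url, agent)


def run(links, user_agents, processes=2):
    # partial() binds user_agents to check; the bound list is pickled and
    # sent to the workers with the tasks, so no global lookup is needed.
    worker = partial(check, user_agents=user_agents)
    with Pool(processes) as pool:
        results = pool.map(worker, links)
    return [r for r in results if r is not None]  # drop empty results


if __name__ == "__main__":
    # Placeholder data for illustration only.
    agents = ["agent-a", "agent-b"]
    links = ["https://example.com/1", "https://example.com/22"]
    print(run(links, agents))
```

This behaves the same under "fork" and "spawn", at the cost of pickling the list along with the submitted work, so it suits small data such as a list of user-agent strings.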