naq*_*hab 3 python visual-studio scrapy python-3.x visual-studio-code
I installed Visual Studio Code on a Windows machine and created a new Scrapy crawler in it. The crawler works fine, but I want to debug the code, so I added this to my launch.json file:
{
    "name": "Scrapy with Integrated Terminal/Console",
    "type": "python",
    "request": "launch",
    "stopOnEntry": true,
    "pythonPath": "${config:python.pythonPath}",
    "program": "C:/Users/neo/.virtualenvs/Gers-Crawler-77pVkqzP/Scripts/scrapy.exe",
    "cwd": "${workspaceRoot}",
    "args": [
        "crawl",
        "amazon",
        "-o",
        "amazon.json"
    ],
    "console": "integratedTerminal",
    "env": {},
    "envFile": "${workspaceRoot}/.env",
    "debugOptions": [
        "RedirectOutput"
    ]
}
However, I cannot hit any breakpoints. PS: I took the JSON script from here: http://www.stevetrefethen.com/blog/debugging-a-python-scrapy-project-in-vscode
cpi*_*mtz 25
To run the typical scrapy runspider <PYTHON_FILE> command, add the following configuration to your launch.json:
{
    "version": "0.1.0",
    "configurations": [
        {
            "name": "Python: Launch Scrapy Spider",
            "type": "python",
            "request": "launch",
            "module": "scrapy",
            "args": [
                "runspider",
                "${file}"
            ],
            "console": "integratedTerminal"
        }
    ]
}
Set breakpoints wherever you like, then start debugging.
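As a rough illustration of why this configuration differs from pointing "program" at scrapy.exe: with "module": "scrapy" the debugger starts Scrapy through the Python interpreter, which roughly corresponds to running python -m scrapy runspider <PYTHON_FILE>. The sketch below only builds that command line; the function name runspider_command is hypothetical and not part of Scrapy or VS Code.

```python
import sys


def runspider_command(spider_file, python=sys.executable):
    """Build the argv that roughly matches the launch configuration above.

    `spider_file` plays the role of VS Code's ${file} variable.
    This helper is illustrative only; it does not start a debugger.
    """
    return [python, "-m", "scrapy", "runspider", spider_file]


# Example: the command for a hypothetical spider file
print(runspider_command("spiders/amazon.py", python="python"))
```

Running the printed command in a terminal should behave like the launch configuration, just without the debugger attached.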
In your Scrapy project folder, create a runner.py module with the following contents:
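import os
from scrapy.cmdline import execute

# Run from the folder that contains this file, so the project settings are found
os.chdir(os.path.dirname(os.path.realpath(__file__)))

try:
    execute(
        [
            'scrapy',
            'crawl',
            'SPIDER NAME',
            '-o',
            'out.json',
        ]
    )
except SystemExit:
    # execute() exits the interpreter when the crawl finishes; ignore that here
    pass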
Place a breakpoint on the line you want to debug.
Run runner.py with the VS Code debugger.
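If you debug more than one spider, the runner can be parameterized. The following is a minimal sketch, not the original answer's code: build_crawl_argv and main are hypothetical names, and the Scrapy import is deferred into main so the argument-building helper can be checked on its own.

```python
import os


def build_crawl_argv(spider_name, output_file=None):
    """Return the argv list passed to scrapy.cmdline.execute for `scrapy crawl`."""
    argv = ["scrapy", "crawl", spider_name]
    if output_file:
        argv += ["-o", output_file]
    return argv


def main(spider_name, output_file=None):
    """Run the crawl in-process so breakpoints in the spider can be hit."""
    # Imported here so build_crawl_argv works even without Scrapy installed
    from scrapy.cmdline import execute

    # Assumes this file lives in the Scrapy project folder, as in the text above
    os.chdir(os.path.dirname(os.path.realpath(__file__)))
    try:
        execute(build_crawl_argv(spider_name, output_file))
    except SystemExit:
        # execute() calls sys.exit() when the crawl finishes
        pass
```

To use it, add a call such as main("amazon", "amazon.json") at the bottom of runner.py (spider name taken from the question) and launch that file under the debugger.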
Configure your launch.json file like this:
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Crawl with scrapy",
            "type": "python",
            "request": "launch",
            "module": "scrapy",
            "cwd": "${fileDirname}",
            "args": [
                "crawl",
                "<SPIDER NAME>"
            ],
            "console": "internalConsole"
        }
    ]
}
Click the VS Code tab that contains your spider so it is the active file, then start the debug session for this launch configuration.
小智 5
You can also try:
{
    "configurations": [
        {
            "name": "Python: Scrapy",
            "type": "python",
            "request": "launch",
            "module": "scrapy",
            "cwd": "${fileDirname}",
            "args": [
                "crawl",
                "${fileBasenameNoExtension}",
                "--loglevel=ERROR"
            ],
            "console": "integratedTerminal",
            "justMyCode": false
        }
    ]
}
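However, the file name (without its extension) must be the same as the spider's name, because ${fileBasenameNoExtension} is what gets passed to scrapy crawl.
--loglevel=ERROR is there to get less verbose output ;)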
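To make the naming requirement concrete, here is a small sketch of how the arguments in that configuration are derived from the open file. It mimics ${fileBasenameNoExtension} with pathlib; crawl_args_for_file is a hypothetical helper for illustration, and the example path is made up.

```python
from pathlib import PurePath


def crawl_args_for_file(spider_file):
    """Mimic the "args" list above for a given open file.

    The spider name is the file name with its last extension removed,
    which is why the file must be named after the spider.
    """
    spider_name = PurePath(spider_file).stem
    return ["crawl", spider_name, "--loglevel=ERROR"]


# A file named amazon.py yields the spider name "amazon"
print(crawl_args_for_file("spiders/amazon.py"))
```

If the spider's name attribute differs from the file name, scrapy crawl would be asked for a spider that does not exist, so either rename the file or hard-code the spider name in "args".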