Debugging a Scrapy project in Visual Studio Code

naq*_*hab 3 python visual-studio scrapy python-3.x visual-studio-code

I installed Visual Studio Code on a Windows machine and built a new Scrapy crawler in it. The crawler works fine, but I want to debug the code, so I added this to my launch.json file:

{
    "name": "Scrapy with Integrated Terminal/Console",
    "type": "python",
    "request": "launch",
    "stopOnEntry": true,
    "pythonPath": "${config:python.pythonPath}",
    "program": "C:/Users/neo/.virtualenvs/Gers-Crawler-77pVkqzP/Scripts/scrapy.exe",
    "cwd": "${workspaceRoot}",
    "args": [
        "crawl",
        "amazon",
        "-o",
        "amazon.json"
    ],
    "console": "integratedTerminal",
    "env": {},
    "envFile": "${workspaceRoot}/.env",
    "debugOptions": [
        "RedirectOutput"
    ]
}

However, none of my breakpoints are hit. PS: I took the JSON from here: http://www.stevetrefethen.com/blog/debugging-a-python-scrapy-project-in-vscode

cpi*_*mtz 25

To run the typical scrapy runspider <PYTHON_FILE> command, put the following configuration in your launch.json:

{
    "version": "0.1.0",
    "configurations": [
        {
            "name": "Python: Launch Scrapy Spider",
            "type": "python",
            "request": "launch",
            "module": "scrapy",
            "args": [
                "runspider",
                "${file}"
            ],
            "console": "integratedTerminal"
        }
    ]
}

Set breakpoints wherever you like, then start debugging.

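  • This should be the accepted answer. Adding a Scrapy-specific configuration to the project's "launch.json" is good practice, easy to do, and does not require an extra script. (3 upvotes)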

fma*_*gno 8

  1. In your Scrapy project folder, create a runner.py module with the following contents:

    import os
    from scrapy.cmdline import execute
    
    os.chdir(os.path.dirname(os.path.realpath(__file__)))
    
    try:
        execute(
            [
                'scrapy',
                'crawl',
                'SPIDER NAME',
                '-o',
                'out.json',
            ]
        )
    except SystemExit:
        pass
    
  2. Set a breakpoint on the line you want to debug.

  3. Run runner.py with the VS Code debugger.
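If you debug more than one spider, the runner above can be parameterised instead of hard-coding 'SPIDER NAME'. The following is only a sketch of that idea, not part of the original answer: build_argv and main are hypothetical helper names, and it assumes you pass the spider name (and optionally an output file) through the "args" list of your debug configuration.

```python
import os
import sys


def build_argv(spider_name, output_file=None):
    """Build the argument list handed to scrapy.cmdline.execute()."""
    argv = ["scrapy", "crawl", spider_name]
    if output_file:
        # Same "-o <file>" pair as in the hard-coded runner above.
        argv += ["-o", output_file]
    return argv


def main(spider_name, output_file=None):
    # Imported here so build_argv() stays usable without Scrapy installed.
    from scrapy.cmdline import execute

    # Work from the folder that contains this file, as the original runner does.
    os.chdir(os.path.dirname(os.path.realpath(__file__)))

    try:
        execute(build_argv(spider_name, output_file))
    except SystemExit:
        # The original runner swallows the exit the same way.
        pass


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: python runner.py <spider name> [output file]")
    else:
        main(sys.argv[1], sys.argv[2] if len(sys.argv) > 2 else None)
```

With no arguments the script only prints the usage line, so a forgotten "args" entry shows up immediately rather than as a failed crawl.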


Man*_*EPA 5

Configure your launch.json file like this:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Crawl with scrapy",
            "type": "python",
            "request": "launch",
            "module": "scrapy",
            "cwd": "${fileDirname}",
            "args": [
                "crawl",
                "<SPIDER NAME>"
            ],
            "console": "internalConsole"
        }
    ]
}

Click the VS Code tab that holds your spider so it is the active file, then start the debug session for this configuration.
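The active tab matters because "cwd" is set to ${fileDirname}. As far as I understand it, scrapy crawl finds the project by looking for scrapy.cfg in the working directory and then in its parent folders, so starting from the spider's own folder works as long as the file sits inside the project. The snippet below is a rough illustration of that kind of lookup, not Scrapy's actual code; find_scrapy_cfg is a hypothetical name.

```python
from pathlib import Path


def find_scrapy_cfg(start_dir):
    """Return the nearest scrapy.cfg at or above start_dir, or None."""
    start = Path(start_dir).resolve()
    # Check the starting folder first, then each parent up to the root.
    for folder in (start, *start.parents):
        candidate = folder / "scrapy.cfg"
        if candidate.is_file():
            return candidate
    return None
```

If this returns None for the folder of the file you have open, the crawl command will most likely not see your project either, and a different "cwd" is worth trying.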


小智 5

You can also try:

{
  "configurations": [
    {
        "name": "Python: Scrapy",
        "type": "python",
        "request": "launch",
        "module": "scrapy",
        "cwd": "${fileDirname}",
        "args": [
            "crawl",
            "${fileBasenameNoExtension}",
            "--loglevel=ERROR"
        ],
        "console": "integratedTerminal",
        "justMyCode": false
    }
  ]
}

But the file name must be the same as the spider's name.

--loglevel=ERROR is there to get more concise output ;)
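To see why the file name has to match, it may help to spell out what VS Code substitutes for ${fileBasenameNoExtension}. This is just an illustrative sketch in Python; resolve_crawl_args is a made-up helper and the path in the usage line is an invented example.

```python
from pathlib import PurePath


def resolve_crawl_args(open_file):
    """Mimic the "args" list above for the file currently open in the editor."""
    # ${fileBasenameNoExtension}: the file name without its last extension.
    spider_name = PurePath(open_file).stem
    return ["crawl", spider_name, "--loglevel=ERROR"]


# With spiders/amazon.py open, the debugger ends up running "scrapy crawl amazon".
print(resolve_crawl_args("C:/project/spiders/amazon.py"))
```

So a spider whose name attribute is "amazon" has to live in amazon.py for this configuration to find it.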