How to call a Python function as a callback inside a C++ thread using pybind11

Pta*_*666 8 c++ python gil pybind11

I designed a C++ system that calls a user-defined callback from a procedure running in a separate thread. Simplified, system.hpp looks like this:

#pragma once

#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

class System
{
public:
  using Callback = std::function<void(int)>;
  System(): t_(), cb_(), stop_(true) {}
  ~System()
  {
    stop();
  }
  bool start()
  {
    if (t_.joinable()) return false;
    stop_ = false;
    t_ = std::thread([this]()
    {
      while (!stop_)
      {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        if (cb_) cb_(1234);
      }
    });
    return true;
  }
  bool stop()
  {
    if (!t_.joinable()) return false;
    stop_ = true;
    t_.join();
    return true;
  }
  bool registerCallback(Callback cb)
  {
    if (t_.joinable()) return false;
    cb_ = cb;
    return true;
  }

private:
  std::thread t_;
  Callback cb_;
  std::atomic_bool stop_;
};

It works fine and can be tested with this short example, main.cpp:

#include <iostream>
#include "system.hpp"

std::atomic<int> g_counter{0};  // incremented on the worker thread, read by main

void foo(int i)
{
  std::cout << i << std::endl;
  g_counter++;
}

int main()
{
  System s;
  s.registerCallback(foo);
  s.start();
  while (g_counter < 3)
  {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  s.stop();
  return 0;
}

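It prints 1234 a few times and then stops. However, I ran into a problem when trying to create Python bindings for my System. If I register a Python function as the callback, my program deadlocks after calling System::stop. I did some research on the topic, and it seems I am facing a problem with the GIL. A reproducible example: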

binding.cpp:

#include "pybind11/functional.h"
#include "pybind11/pybind11.h"

#include "system.hpp"

namespace py = pybind11;

PYBIND11_MODULE(mysystembinding, m) {
  py::class_<System>(m, "System")
    .def(py::init<>())
    .def("start", &System::start)
    .def("stop", &System::stop)
    .def("registerCallback", &System::registerCallback);
}

Python script:

#!/usr/bin/env python

import mysystembinding
import time

g_counter = 0

def foo(i):
  global g_counter
  print(i)
  g_counter = g_counter + 1

s = mysystembinding.System()
s.registerCallback(foo)
s.start()
while g_counter < 3:
  time.sleep(1)
s.stop()

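I have read the section of the pybind11 documentation on acquiring and releasing the GIL on the C++ side. However, I could not get rid of the deadlock in my case: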

PYBIND11_MODULE(mysystembinding, m) {
  py::class_<System>(m, "System")
    .def(py::init<>())
    .def("start", &System::start)
    .def("stop", &System::stop)
    .def("registerCallback", [](System* s, System::Callback cb)
      {
        s->registerCallback([cb](int i)
        {
          // py::gil_scoped_acquire acquire;
          // py::gil_scoped_release release;
          cb(i);
        });
      });
}

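If I add py::gil_scoped_acquire acquire; before invoking the callback, the deadlock still occurs. If I add py::gil_scoped_release release; before invoking the callback, I get

Fatal Python error: PyEval_SaveThread: NULL tstate

What should I do to register a Python function as a callback and avoid the deadlock?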

Pta*_*666 5

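Thanks to this discussion and many other resources (1, 2, 3), I found that guarding the functions that start and join the C++ thread with gil_scoped_release seems to solve the problem: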

PYBIND11_MODULE(mysystembinding, m) {
  py::class_<System>(m, "System")
    .def(py::init<>())
    .def("start", &System::start, py::call_guard<py::gil_scoped_release>())
    .def("stop", &System::stop, py::call_guard<py::gil_scoped_release>())
    .def("registerCallback", &System::registerCallback);
}

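Apparently, the deadlock occurred because Python was holding the lock (the GIL) when it called the bindings responsible for the C++ thread operations. I am still not sure my reasoning is correct, so I would appreciate any expert comments.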
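
To make that reasoning concrete, here is a toy model in plain C++ with no Python involved. It is only an analogy: a std::mutex named fake_gil stands in for the GIL, and Worker and run_with_release are hypothetical names invented for this sketch. The worker takes the "GIL" before each callback, much as the pybind11/functional.h wrapper acquires the real GIL before calling a Python function. The caller plays the role of the Python script and gives up the lock around start() and stop(), which is what py::call_guard<py::gil_scoped_release>() does for the bound methods.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Stand-in for the GIL (an assumption of this sketch, not the real mechanism).
std::mutex fake_gil;

// Hypothetical miniature of System: a thread that periodically runs a callback.
class Worker {
public:
  explicit Worker(std::function<void(int)> cb) : cb_(std::move(cb)) {}
  ~Worker() {
    if (t_.joinable()) { stop_ = true; t_.join(); }
  }

  void start() {
    stop_ = false;
    t_ = std::thread([this] {
      while (!stop_) {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        // Take the "GIL" before running the callback, as the wrapper
        // around a Python callable would.
        std::lock_guard<std::mutex> acquire(fake_gil);
        cb_(1234);
      }
    });
  }

  // The caller must not hold fake_gil here. If it did, the worker could
  // block on the lock above while the caller blocks in join(): a deadlock.
  void stop() {
    stop_ = true;
    t_.join();
  }

private:
  std::function<void(int)> cb_;
  std::atomic<bool> stop_{true};
  std::thread t_;
};

// Models the Python script driving bindings that release the "GIL".
int run_with_release() {
  int calls = 0;                               // only touched under fake_gil
  std::unique_lock<std::mutex> gil(fake_gil);  // "Python code is running"
  Worker w([&calls](int) { ++calls; });

  gil.unlock();  // model of gil_scoped_release around start()
  w.start();
  gil.lock();

  while (calls < 3) {
    gil.unlock();  // like time.sleep(), which lets other threads run
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    gil.lock();
  }

  gil.unlock();  // model of gil_scoped_release around stop()
  w.stop();      // join() can finish: the worker is free to take the lock
  gil.lock();
  return calls;
}
```

Calling run_with_release() returns once the callback has run at least three times. If the gil.unlock() before w.stop() is removed, the model can hang in the same way the original binding did: the caller waits in join() while holding the lock, and the worker waits for the lock in order to run its next callback.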