Dav*_*rtz 6 linux multithreading boost-asio
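Boost's ASIO dispatcher seems to have a serious problem, and I can't find a workaround. The symptom is that the only thread waiting to dispatch is left in pthread_cond_wait even though there are pending I/O operations that require it to be blocked in epoll_wait.
I can most easily reproduce the issue by having one thread call poll_one in a loop until it returns zero. This can leave the thread calling run stuck in pthread_cond_wait while the thread calling poll_one breaks out of its loop. Presumably the io_service expects that thread to come back and block in epoll_wait, but it is under no obligation to do so, and that expectation appears to be fatal.
Is there a requirement that threads be statically associated with io_services?
Here is an example showing the deadlock. This is the only thread handling this io_service, because the others have moved on. There are definitely socket operations pending: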
#0 pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 boost::asio::detail::posix_event::wait<boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex> > (...) at /usr/include/boost/asio/detail/posix_event.hpp:80
#2 boost::asio::detail::task_io_service::do_run_one (...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:405
#3 boost::asio::detail::task_io_service::run (...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:146
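I believe the bug is as follows: if the thread servicing the I/O queue is the one performing the socket readiness check, and it then calls a dispatch function, it must signal if any other threads are blocked on the io_service. It currently signals only if handlers are ready to run at that moment. That leaves no thread checking for socket readiness.
This is a bug. I was able to reproduce it by adding a delay in the non-critical section of task_io_service::do_poll_one. Below is the modified snippet of task_io_service::do_poll_one() in boost/asio/detail/impl/task_io_service.ipp. The only line added is the sleep.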
std::size_t task_io_service::do_poll_one(mutex::scoped_lock& lock,
    task_io_service::thread_info& this_thread,
    const boost::system::error_code& ec)
{
  if (stopped_)
    return 0;

  operation* o = op_queue_.front();
  if (o == &task_operation_)
  {
    op_queue_.pop();
    lock.unlock();

    {
      task_cleanup c = { this, &lock, &this_thread };
      (void)c;

      // Run the task. May throw an exception. Only block if the operation
      // queue is empty and we're not polling, otherwise we want to return
      // as soon as possible.
      task_->run(false, this_thread.private_op_queue);
      boost::this_thread::sleep_for(boost::chrono::seconds(3));
    }

    o = op_queue_.front();
    if (o == &task_operation_)
      return 0;
  }
...
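My test driver is fairly basic: an asynchronous work loop on the io_service prints "." and re-arms a 3-second timer; a single thread is spawned to poll the io_service; after a short delay, main calls io_service::run() while the poll thread is sleeping in task_io_service::do_poll_one(). Test code: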
#include <iostream>

#include <boost/asio/io_service.hpp>
#include <boost/asio/steady_timer.hpp>
#include <boost/bind.hpp>  // boost::bind and boost::ref are used below
#include <boost/chrono.hpp>
#include <boost/thread.hpp>

boost::asio::io_service io_service;
boost::asio::steady_timer timer(io_service);

// Prints "." and re-arms the timer, forming an endless asynchronous loop.
void arm_timer()
{
  std::cout << ".";
  std::cout.flush();
  timer.expires_from_now(boost::chrono::seconds(3));
  timer.async_wait(boost::bind(&arm_timer));
}

int main()
{
  // Add asynchronous work loop.
  arm_timer();

  // Spawn poll thread.
  boost::thread poll_thread(
    boost::bind(&boost::asio::io_service::poll, boost::ref(io_service)));

  // Give the poll thread time to service the reactor.
  boost::this_thread::sleep_for(boost::chrono::seconds(1));

  io_service.run();
}
And the debug session:
[twsansbury@localhost bug]$ gdb a.out
...
(gdb) r
Starting program: /home/twsansbury/dev/bug/a.out
[Thread debugging using libthread_db enabled]
.[New Thread 0xb7feeb90 (LWP 31892)]
[Thread 0xb7feeb90 (LWP 31892) exited]
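At this point, arm_timer() has printed "." once (when it was initially armed). The poll thread serviced the reactor in a non-blocking manner and then slept for 3 seconds while op_queue_ was empty (task_operation_ is added back to op_queue_ when task_cleanup c goes out of scope). While op_queue_ was empty, the main thread called io_service::run(), saw that op_queue_ was empty, and made itself first_idle_thread_, where it waits on its wakeup_event. The poll thread finishes sleeping and returns 0, leaving the main thread waiting on wakeup_event.
After waiting about 10 seconds, plenty of time for arm_timer() to be ready, I interrupt the debugger: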
Program received signal SIGINT, Interrupt.
0x00919402 in __kernel_vsyscall ()
(gdb) bt
#0 0x00919402 in __kernel_vsyscall ()
#1 0x0081bbc5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2 0x00763b3d in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libc.so.6
#3 0x08059dc2 in void boost::asio::detail::posix_event::wait >(boost::asio::detail::scoped_lock&) ()
#4 0x0805a009 in boost::asio::detail::task_io_service::do_run_one(boost::asio::detail::scoped_lock&, boost::asio::detail::task_io_service_thread_info&, boost::system::error_code const&) ()
#5 0x0805a11c in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()
#6 0x0805a1e2 in boost::asio::io_service::run() ()
#7 0x0804db78 in main ()
Here is a side-by-side timeline:
poll thread                            | main thread
---------------------------------------+---------------------------------------
lock()                                 |
do_poll_one()                          |
|-- pop task_operation_ from           |
|   op_queue_                          |
|-- unlock()                           | lock()
|-- create task_cleanup                | do_run_one()
|-- service reactor (non-block)        | `-- op_queue_ is empty
|-- ~task_cleanup()                    |     |-- set thread as idle
|   |-- lock()                         |     `-- unlock()
|   `-- op_queue_.push(                |
|       task_operation_)               |
`-- task_operation_ is                 |
    op_queue_.front()                  |
    `-- return 0                       | // still waiting on wakeup_event
unlock()                               |
As best I can tell, there are no side effects from patching:
    if (o == &task_operation_)
      return 0;
to:
    if (o == &task_operation_)
    {
      if (!one_thread_)
        wake_one_thread_and_unlock(lock);
      return 0;
    }
Run Code Online (Sandbox Code Playgroud)
Regardless, I have submitted a bug report and a fix. Consider keeping an eye on the ticket for an official response.