Converting a Python io object to std::istream when using boost::python

Pet*_*pov 3 c++ python io boost-python

While writing my first Django application, I ran into the following problem with boost::python. From Python code, I need to pass an io.BytesIO to a C++ class that takes a std::istream.

I have a legacy C++ library that reads files of a certain format. Let's call it somelib. The library's interface takes a std::istream as input. Something like this:

class SomeReader
{
public:
    bool read_from_stream(std::istream&);
};

I want to wrap it so that I can use the library from Python in the following way:

import io
import somelib

reader = somelib.SomeReader()
print ">>Python: reading from BytesIO"
buf = io.BytesIO("Hello Stack Overflow")
reader.read(buf)

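I found out how to do this for an actual Python file object, but it is not clear how to do it for arbitrary file-like objects. Here is the definition of the Python bindings I have so far: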

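#include <cstdio>     // FILE, fileno (POSIX)
#include <istream>
#include <stdexcept>  // std::runtime_error
#include <boost/iostreams/device/file_descriptor.hpp>
#include <boost/iostreams/stream_buffer.hpp>
#include <boost/python.hpp>

using namespace boost::python;
namespace io = boost::iostreams;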

struct SomeReaderWrap: SomeReader, wrapper<SomeReader>
{
    bool read(object &py_file)
    {
        if (PyFile_Check(py_file.ptr()))
        {
            FILE* handle = PyFile_AsFile(py_file.ptr());
            io::stream_buffer<io::file_descriptor_source> fpstream (fileno(handle), io::never_close_handle);
            std::istream in(&fpstream);
            return this->read_from_stream(in);
        }
        else
        {
            //
            // How do we implement this???
            //
            throw std::runtime_error("Not a file, have no idea how to read this!");
        }
    }
};


BOOST_PYTHON_MODULE(somelib)
{
    class_<SomeReaderWrap, boost::noncopyable>("SomeReader")
        .def("read", &SomeReaderWrap::read);
}

Is there a more or less generic way to convert a Python IO object into a C++ stream?

Thanks in advance.


As a result of my experiments, I created a small GitHub repo that illustrates the problem.

Tan*_*ury 5

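Rather than converting the Python io.BytesIO object, consider implementing a model of the Boost.IOStreams Source concept that can read from a Python io.BytesIO object. This allows one to construct a boost::iostreams::stream that can be passed to SomeReader::read_from_stream().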

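The tutorial demonstrates how to create and use a custom Boost.IOStreams Source. Overall, the process should be fairly straightforward. One only needs to implement the Source concept's read() function in terms of io.BufferedIOBase.read():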

/// Type that implements the Boost.IOStreams Source concept for reading
/// data from a Python object supporting read(size).
class PythonInputDevice
  : public boost::iostreams::source // Use convenience class.
{
public:

  explicit
  PythonInputDevice(boost::python::object object)
    : object_(object)
  {}

  std::streamsize read(char_type* buffer, std::streamsize buffer_size) 
  {
    namespace python = boost::python;
    // Read data through the Python object's API.  The following is
    // equivalent to:
    //   data = object_.read(buffer_size)
    python::object py_data = object_.attr("read")(buffer_size);
    std::string data = python::extract<std::string>(py_data);

    // If the string is empty, then EOF has been reached.
    if (data.empty())
    {
      return -1; // Indicate end-of-sequence, per Source concept.
    }

    // Otherwise, copy data into the buffer.
    std::copy(data.begin(), data.end(), buffer);
    return static_cast<std::streamsize>(data.size());
  }

private:
  boost::python::object object_;
};

Then create a boost::iostreams::stream using the Source device:

boost::iostreams::stream<PythonInputDevice> input(py_object);
SomeReader reader;
reader.read_from_stream(input);

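Because PythonInputDevice is implemented in terms of object.read(), duck typing allows PythonInputDevice to be used with any Python object that supports a read() method with the same pre- and post-conditions. This includes the built-in Python file object, so the type-based conditional branching in SomeReaderWrap::read() is no longer needed.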
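
To make the read(size) contract concrete without Boost or Python, here is a minimal standard-library sketch of the same idea. It is not the Boost.IOStreams approach described above: it subclasses std::streambuf and refills its buffer from a callback instead. The names ReadFn, CallbackStreambuf, StringSource, and read_all are hypothetical and exist only for illustration; StringSource is an assumed stand-in for a Python object whose read(size) returns at most size bytes and an empty string at end of data.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <iterator>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical stand-in for a Python object's read(size): returns at most
// `size` bytes, and an empty string once the data is exhausted.
using ReadFn = std::function<std::string(std::size_t)>;

// std::streambuf that refills its get area by calling a ReadFn.
class CallbackStreambuf : public std::streambuf
{
public:
  explicit CallbackStreambuf(ReadFn read, std::size_t chunk_size = 4096)
    : read_(std::move(read)), chunk_size_(chunk_size)
  {
    assert(chunk_size_ > 0);  // A zero-sized request would look like EOF.
  }

protected:
  int_type underflow() override
  {
    // Characters are still available in the current chunk.
    if (gptr() < egptr())
    {
      return traits_type::to_int_type(*gptr());
    }

    // Ask the source for the next chunk; an empty chunk means EOF.
    chunk_ = read_(chunk_size_);
    if (chunk_.empty())
    {
      return traits_type::eof();
    }

    // Expose the new chunk as the get area.
    setg(&chunk_[0], &chunk_[0], &chunk_[0] + chunk_.size());
    return traits_type::to_int_type(*gptr());
  }

private:
  ReadFn read_;
  std::size_t chunk_size_;
  std::string chunk_;
};

// Hypothetical in-memory source that mimics io.BytesIO.read(size).
struct StringSource
{
  explicit StringSource(std::string d) : data(std::move(d)), pos(0) {}

  std::string operator()(std::size_t size)
  {
    std::string chunk = data.substr(pos, size);
    pos += chunk.size();
    return chunk;
  }

  std::string data;
  std::size_t pos;
};

// Drains a stream into a string, much like the example reader does.
std::string read_all(std::istream& input)
{
  return std::string(std::istreambuf_iterator<char>(input),
                     std::istreambuf_iterator<char>());
}
```

In this sketch, constructing `CallbackStreambuf buffer(StringSource("Hello Stack Overflow"), 4);` and wrapping it in `std::istream input(&buffer);` gives a stream that legacy code taking std::istream& can consume. Any callable with the same return-empty-at-EOF behavior can replace StringSource, which is the C++ analogue of the duck typing described above.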


Here is a complete minimal example based on the original code:

#include <algorithm> // std::copy
#include <iosfwd> // std::streamsize
#include <iostream>
#include <iterator> // std::istreambuf_iterator
#include <string>
#include <boost/python.hpp>
#include <boost/iostreams/concepts.hpp>  // boost::iostreams::source
#include <boost/iostreams/stream.hpp>
class SomeReader
{
public:
  bool read_from_stream(std::istream& input)
  {
    std::string content(std::istreambuf_iterator<char>(input.rdbuf()),
                        (std::istreambuf_iterator<char>()));
    std::cout << "SomeReader::read_from_stream(): " << content << std::endl;
    return true;      
  }
};

/// Type that implements a model of the Boost.IOStreams Source concept
/// for reading data from a Python object supporting:
///   data = object.read(size).
class PythonInputDevice
  : public boost::iostreams::source // Use convenience class.
{
public:

  explicit
  PythonInputDevice(boost::python::object object)
    : object_(object)
  {}

  std::streamsize read(char_type* buffer, std::streamsize buffer_size) 
  {
    namespace python = boost::python;
    // Read data through the Python object's API.  The following is
    // equivalent to:
    //   data = object_.read(buffer_size)
    python::object py_data = object_.attr("read")(buffer_size);
    std::string data = python::extract<std::string>(py_data);

    // If the string is empty, then EOF has been reached.
    if (data.empty())
    {
      return -1; // Indicate end-of-sequence, per Source concept.
    }

    // Otherwise, copy data into the buffer.
    std::copy(data.begin(), data.end(), buffer);
    return static_cast<std::streamsize>(data.size());
  }

private:
  boost::python::object object_;
};

struct SomeReaderWrap
  : SomeReader,
    boost::python::wrapper<SomeReader>
{
  bool read(boost::python::object& object)
  {
    boost::iostreams::stream<PythonInputDevice> input(object);
    return this->read_from_stream(input);
  }
};

BOOST_PYTHON_MODULE(example)
{
  namespace python = boost::python;
  python::class_<SomeReaderWrap, boost::noncopyable>("SomeReader")
    .def("read", &SomeReaderWrap::read)
    ;
}

Interactive usage:

$ echo -n "test file" > test_file
$ python
>>> import example
>>> with open('test_file') as f:
...     reader = example.SomeReader()
...     reader.read(f)
... 
SomeReader::read_from_stream(): test file
True
>>> import io
>>> with io.BytesIO("Hello Stack Overflow") as f:
...     reader = example.SomeReader()
...     reader.read(f)
... 
SomeReader::read_from_stream(): Hello Stack Overflow
True