用C/C++解析二进制消息流

Question

用C/C++解析二进制消息流

gag*_*aga 6 c c++ binary parsing memory-alignment

我正在为二进制协议(Javad GRIL协议)编写解码器.它由大约一百条消息组成,数据格式如下:

struct MsgData {
    uint8_t num;
    float x, y, z;
    uint8_t elevation;
    ...
};

Run Code Online (Sandbox Code Playgroud)

这些字段是ANSI编码的二进制数,它们彼此之间没有间隙.解析此类消息的最简单方法是将输入的字节数组转换为适当的类型.问题是流中的数据是打包的,即未对齐的.

在x86上,这可以通过使用来解决#pragma pack(1).但是,这在某些其他平台上不起作用,或者由于未对齐数据而导致性能开销.

另一种方法是为每种消息类型编写一个特定的解析函数,但正如我所提到的,该协议包含数百条消息.

另一种选择是使用类似Perl unpack()函数的东西并在某处存储消息格式.说,我们可以#define MsgDataFormat "CfffC"再打电话unpack(pMsgBody, MsgDataFormat).这要短得多,但仍然容易出错并且多余.此外,格式可能更复杂,因为消息可以包含数组,因此解析器将是缓慢而复杂的.

有没有共同有效的解决方案？我已经阅读了这篇文章,并用Google搜索,但没有找到更好的方法来做到这一点.

也许C++有一个解决方案？

Answer 1

sbi*_*sbi 7

好的,下面用VC10和GCC 4.5.1编译(在ideone.com上).我认为C++ 1x的所有这些需求<tuple>都应该std::tr1::tuple在旧的编译器中可用(如).

它仍然需要您为每个成员键入一些代码,但这是非常小的代码.(最后见我的解释.)

#include <iostream>
#include <tuple>

typedef unsigned char uint8_t;
typedef unsigned char byte_t;

struct MsgData {
    uint8_t num;
    float x;
    uint8_t elevation;

    static const std::size_t buffer_size = sizeof(uint8_t)
                                         + sizeof(float) 
                                         + sizeof(uint8_t);

    std::tuple<uint8_t&,float&,uint8_t&> get_tied_tuple()
    {return std::tie(num, x, elevation);}
    std::tuple<const uint8_t&,const float&,const uint8_t&> get_tied_tuple() const
    {return std::tie(num, x, elevation);}
};

// needed only for test output
inline std::ostream& operator<<(std::ostream& os, const MsgData& msgData)
{
    os << '[' << static_cast<int>(msgData.num) << ' ' 
       << msgData.x << ' ' << static_cast<int>(msgData.elevation) << ']';
    return os;
}

namespace detail {

    // overload the following two for types that need special treatment
    template<typename T>
    const byte_t* read_value(const byte_t* bin, T& val)
    {
        val = *reinterpret_cast<const T*>(bin);
        return bin + sizeof(T)/sizeof(byte_t);
    }
    template<typename T>
    byte_t* write_value(byte_t* bin, const T& val)
    {
        *reinterpret_cast<T*>(bin) = val;
        return bin + sizeof(T)/sizeof(byte_t);
    }

    template< typename MsgTuple, unsigned int Size = std::tuple_size<MsgTuple>::value >
    struct msg_serializer;

    template< typename MsgTuple >
    struct msg_serializer<MsgTuple,0> {
        static const byte_t* read(const byte_t* bin, MsgTuple&) {return bin;}
        static byte_t* write(byte_t* bin, const MsgTuple&)      {return bin;}
    };

    template< typename MsgTuple, unsigned int Size >
    struct msg_serializer {
        static const byte_t* read(const byte_t* bin, MsgTuple& msg)
        {
            return read_value( msg_serializer<MsgTuple,Size-1>::read(bin, msg)
                             , std::get<Size-1>(msg) );
        }
        static byte_t* write(byte_t* bin, const MsgTuple& msg)
        {
            return write_value( msg_serializer<MsgTuple,Size-1>::write(bin, msg)
                              , std::get<Size-1>(msg) );
        }
    };

    template< class MsgTuple >
    inline const byte_t* do_read_msg(const byte_t* bin, MsgTuple msg)
    {
        return msg_serializer<MsgTuple>::read(bin, msg);
    }

    template< class MsgTuple >
    inline byte_t* do_write_msg(byte_t* bin, const MsgTuple& msg)
    {
        return msg_serializer<MsgTuple>::write(bin, msg);
    }
}

template< class Msg >
inline const byte_t* read_msg(const byte_t* bin, Msg& msg)
{
    return detail::do_read_msg(bin, msg.get_tied_tuple());
}

template< class Msg >
inline const byte_t* write_msg(byte_t* bin, const Msg& msg)
{
    return detail::do_write_msg(bin, msg.get_tied_tuple());
}

int main()
{
    byte_t buffer[MsgData::buffer_size];

    std::cout << "buffer size is " << MsgData::buffer_size << '\n';

    MsgData msgData;
    std::cout << "initializing data...";
    msgData.num = 42;
    msgData.x = 1.7f;
    msgData.elevation = 17;
    std::cout << "data is now " << msgData << '\n';
    write_msg(buffer, msgData);

    std::cout << "clearing data...";
    msgData = MsgData();
    std::cout << "data is now " << msgData << '\n';

    std::cout << "reading data...";
    read_msg(buffer, msgData);
    std::cout << "data is now " << msgData << '\n';

    return 0;
}

Run Code Online (Sandbox Code Playgroud)

对我来说这打印

buffer size is 6
initializing data...data is now [0x2a 1.7 0x11]
clearing data...data is now [0x0 0 0x0]
reading data...data is now [0x2a 1.7 0x11]

(我已将您的MsgData类型缩短为仅包含三个数据成员,但这仅用于测试.)

对于每种消息类型,您需要定义其buffer_size静态常量和两个get_tied_tuple()成员函数,一个const和一个非函数const,两者都以相同的方式实现.(当然,这些也可能是非成员,但我试图让它们靠近它们所依赖的数据成员列表.)
对于某些类型(如std::string),你需要添加那些detail::read_value()和detail::write_value()函数的特殊重载.
所有消息类型的其余机器保持不变.

有了完整的C++ 1x支持,您可以摆脱必须完全键入get_tied_tuple()成员函数的显式返回类型,但我实际上没有尝试过这个.

Answer 2

Nim*_*Nim 1

简单的答案是否定的，如果消息是无法简单转换的特定二进制格式，那么您别无选择，只能为其编写解析器。如果您有消息描述（例如 xml 或某种形式的易于解析的描述），为什么不根据该描述自动生成解析代码呢？它不会像强制转换那么快，但生成速度会比手动编写每条消息快得多......

归档时间：	15 年前
查看次数：	7848 次
最近记录：	10 年，6 月前