Parsing nested key-value pairs in Boost Spirit

man*_*ter 5 c++ boost boost-spirit

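I'm having trouble writing what I believe should be a simple parser using Boost::Spirit. (I'm using Spirit rather than just string functions because this is partly a learning exercise for me.)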

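Data

The data to be parsed takes the form of key-value pairs, where a value can itself consist of key-value pairs. Keys are alphanumeric (underscores allowed, no digit as the first character); values are alphanumeric plus ".", "-" and "_". Besides plain old alphanumeric strings, values can be dates in the format DD-MMM-YYYY (e.g. 01-Jan-2015) or floating-point numbers such as 3.1415. Keys and values are separated by "="; pairs are separated by ";"; structured values are delimited by "{" and "}". At the moment I strip all whitespace from the user input before passing it to Spirit.

Example input: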

Key1 = Value1; Key2 = { NestedKey1=Alan; NestedKey2 = 43.1232; }; Key3 = 15-Jul-1974 ;

I then strip all the whitespace, which gives

Key1=Value1;Key2={NestedKey1=Alan;NestedKey2=43.1232;};Key3=15-Jul-1974;

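and that is what I actually pass to Spirit.

Problem

What I have now works just dandy when the values are plain values. As soon as I encode structured values in the input, Spirit stops after the first structured value. If there is only one structured value, a workaround is to put it at the end of the input... but sometimes I need two or more structured values.

The Code

The following compiles in VS2013 and illustrates the error: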

#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/pair.hpp>
#include <boost/fusion/adapted.hpp>
#include <map>
#include <string>
#include <iostream>

typedef std::map<std::string, std::string> ARGTYPE;

#define BOOST_SPIRIT_DEBUG

namespace qi = boost::spirit::qi;
namespace fusion = boost::fusion;

template < typename It, typename Skipper>
struct NestedGrammar : qi::grammar < It, ARGTYPE(), Skipper >
{
    NestedGrammar() : NestedGrammar::base_type(Sequence)
    {
        using namespace qi;
        KeyName = qi::char_("a-zA-Z_") >> *qi::char_("a-zA-Z0-9_");
        Value = +qi::char_("-.a-zA-Z_0-9");

        Pair = KeyName >> -(
            '=' >> ('{' >> raw[Sequence] >> '}' | Value)
            );

        Sequence = Pair >> *((qi::lit(';') | '&') >> Pair);

        BOOST_SPIRIT_DEBUG_NODE(KeyName);
        BOOST_SPIRIT_DEBUG_NODE(Value);
        BOOST_SPIRIT_DEBUG_NODE(Pair);
        BOOST_SPIRIT_DEBUG_NODE(Sequence);
    }
private:
    qi::rule<It, ARGTYPE(), Skipper> Sequence;
    qi::rule<It, std::string()> KeyName;
    qi::rule<It, std::string(), Skipper> Value;
    qi::rule<It, std::pair < std::string, std::string>(), Skipper> Pair;
};


template <typename Iterator>
ARGTYPE Parse2(Iterator begin, Iterator end)
{
    NestedGrammar<Iterator, qi::space_type> p;
    ARGTYPE data;
    qi::phrase_parse(begin, end, p, qi::space, data);
    return data;
}


// ARGTYPE is std::map<std::string,std::string>
void NestedParse(std::string Input, ARGTYPE& Output)
{
    Input.erase(std::remove_if(Input.begin(), Input.end(), isspace), Input.end());
    Output = Parse2(Input.begin(), Input.end());
}

int main(int argc, char** argv)
{
    std::string Example1, Example2, Example3;
    ARGTYPE Out;

    Example1 = "Key1=Value1 ; Key2 = 01-Jan-2015; Key3 = 2.7181; Key4 = Johnny";
    Example2 = "Key1 = Value1; Key2 = {InnerK1 = one; IK2 = 11-Nov-2011;};";
    Example3 = "K1 = V1; K2 = {IK1=IV1; IK2=IV2;}; K3=V3; K4 = {JK1=JV1; JK2=JV2;};";

    NestedParse(Example1, Out);
    for (ARGTYPE::iterator i = Out.begin(); i != Out.end(); i++)
        std::cout << i->first << "|" << i->second << std::endl;
    std::cout << "=====" << std::endl;

    /* get the following, as expected:
    Key1|Value1
    Key2|01-Jan-2015
    Key3|2.7181
    Key4|Johnny
    */

    NestedParse(Example2, Out);
    for (ARGTYPE::iterator i = Out.begin(); i != Out.end(); i++)
        std::cout << i->first << "|" << i->second << std::endl;
    std::cout << "=====" << std::endl;

    /* get the following, as expected:
    Key1|Value1
    Key2|InnerK1=one;IK2=11-Nov-2011
    */

    NestedParse(Example3, Out);
    for (ARGTYPE::iterator i = Out.begin(); i != Out.end(); i++)
        std::cout << i->first << "|" << i->second << std::endl;

    /* Only get the first two lines of the expected output:
    K1|V1
    K2|IK1=IV1;IK2=IV2
    K3|V3
    K4|JK1=JV1;JK2=JV2
    */

    return 0;

}

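I'm not sure whether the problem is down to my ignorance of BNF, my ignorance of Spirit, or, at this point, my ignorance of both.

Any help is appreciated. I have read e.g. "Spirit Qi sequence parsing issues" and the links therein, but I still can't figure out what I'm doing wrong.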

seh*_*ehe 2

Indeed, this is precisely the kind of simple grammar that Spirit excels at.

Moreover, there is absolutely no need to strip whitespace up front: Spirit has skippers built in for exactly that purpose.

To your explicit question, though:

The Sequence rule is overcomplicated. You can just use the list operator (%):

Sequence = Pair % char_(";&");

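Now your problem is that you end the sequence with a ";" that isn't expected, so both Sequence and Value eventually fail to parse. This isn't very obvious unless you #define BOOST_SPIRIT_DEBUG¹ and inspect the debug output.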

So, to fix it, use:

Sequence = Pair % char_(";&") >> -omit[char_(";&")];

Fix Live On Coliru (or with debug info)

Prints:

Key1|Value1
Key2|01-Jan-2015
Key3|2.7181
Key4|Johnny
=====
Key1|Value1
Key2|InnerK1=one;IK2=11-Nov-2011;
=====
K1|V1
K2|IK1=IV1;IK2=IV2;
K3|V3
K4|JK1=JV1;JK2=JV2;

Bonus Cleanup

Actually, that was simple. Just remove the redundant line that strips the whitespace. The skipper was already qi::space.

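(Note, though, that the skipper doesn't apply to your Value rule, so values cannot contain whitespace, but the parser will not silently skip it either; I suppose this is likely what you want. Just be aware of it.)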

Recursive AST

What you actually want is a recursive AST, instead of parsing into a flat map.

Boost recursive variants make this a breeze:

namespace ast {
    typedef boost::make_recursive_variant<std::string, std::map<std::string, boost::recursive_variant_> >::type Value;
    typedef std::map<std::string, Value> Sequence;
}

To make this work, you just change the declared attribute types of the rules:

qi::rule<It, ast::Sequence(),                      Skipper> Sequence;
qi::rule<It, std::pair<std::string, ast::Value>(), Skipper> Pair;
qi::rule<It, std::string(),                        Skipper> String;
qi::rule<It, std::string()>                                 KeyName;

The rules themselves don't even have to change at all. You will need to write a little visitor to stream the AST:

static inline std::ostream& operator<<(std::ostream& os, ast::Value const& value) {
    struct vis : boost::static_visitor<> {
        vis(std::ostream& os, std::string indent = "") : _os(os), _indent(indent) {}

        void operator()(std::map<std::string, ast::Value> const& map) const {
            _os << "map {\n";
            for (auto& entry : map) {
                _os << _indent << "    " << entry.first << '|';
                boost::apply_visitor(vis(_os, _indent+"    "), entry.second);
                _os << "\n";
            }
            _os << _indent << "}\n";
        }
        void operator()(std::string const& s) const {
            _os << s;
        }

    private:
        std::ostream& _os;
        std::string _indent;
    };
    boost::apply_visitor(vis(os), value);
    return os;
}

Now it prints:

map {
    Key1|Value1
    Key2|01-Jan-2015
    Key3|2.7181
    Key4|Johnny
}

=====
map {
    Key1|Value1
    Key2|InnerK1 = one; IK2 = 11-Nov-2011;
}

=====
map {
    K1|V1
    K2|IK1=IV1; IK2=IV2;
    K3|V3
    K4|JK1=JV1; JK2=JV2;
}

Of course, the clincher is when you change raw[Sequence] to just Sequence:

map {
    Key1|Value1
    Key2|01-Jan-2015
    Key3|2.7181
    Key4|Johnny
}

=====
map {
    Key1|Value1
    Key2|map {
        IK2|11-Nov-2011
        InnerK1|one
    }

}

=====
map {
    K1|V1
    K2|map {
        IK1|IV1
        IK2|IV2
    }

    K3|V3
    K4|map {
        JK1|JV1
        JK2|JV2
    }

}

Full Demo Code

Live On Coliru

//#define BOOST_SPIRIT_DEBUG
#include <boost/variant.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <iostream>
#include <string>
#include <map>

namespace ast {
    typedef boost::make_recursive_variant<std::string, std::map<std::string, boost::recursive_variant_> >::type Value;
    typedef std::map<std::string, Value> Sequence;
}

namespace qi = boost::spirit::qi;

template <typename It, typename Skipper>
struct NestedGrammar : qi::grammar <It, ast::Sequence(), Skipper>
{
    NestedGrammar() : NestedGrammar::base_type(Sequence)
    {
        using namespace qi;
        KeyName = qi::char_("a-zA-Z_") >> *qi::char_("a-zA-Z0-9_");
        String = +qi::char_("-.a-zA-Z_0-9");

        // A value is either a nested sequence in braces or a plain string
        Pair = KeyName >> -(
                '=' >> ('{' >> Sequence >> '}' | String)
            );

        // List of pairs, with an optional trailing separator
        Sequence = Pair % char_(";&") >> -omit[char_(";&")];

        BOOST_SPIRIT_DEBUG_NODES((KeyName) (String) (Pair) (Sequence))
    }
private:
    qi::rule<It, ast::Sequence(),                      Skipper> Sequence;
    qi::rule<It, std::pair<std::string, ast::Value>(), Skipper> Pair;
    qi::rule<It, std::string(),                        Skipper> String;
    qi::rule<It, std::string()>                                 KeyName;
};


template <typename Iterator>
ast::Sequence DoParse(Iterator begin, Iterator end)
{
    NestedGrammar<Iterator, qi::space_type> p;
    ast::Sequence data;
    qi::phrase_parse(begin, end, p, qi::space, data);
    return data;
}

static inline std::ostream& operator<<(std::ostream& os, ast::Value const& value) {
    struct vis : boost::static_visitor<> {
        vis(std::ostream& os, std::string indent = "") : _os(os), _indent(indent) {}

        void operator()(std::map<std::string, ast::Value> const& map) const {
            _os << "map {\n";
            for (auto& entry : map) {
                _os << _indent << "    " << entry.first << '|';
                boost::apply_visitor(vis(_os, _indent+"    "), entry.second);
                _os << "\n";
            }
            _os << _indent << "}\n";
        }
        void operator()(std::string const& s) const {
            _os << s;
        }

    private:
        std::ostream& _os;
        std::string _indent;
    };
    boost::apply_visitor(vis(os), value);
    return os;
}

int main()
{
    std::string const Example1 = "Key1=Value1 ; Key2 = 01-Jan-2015; Key3 = 2.7181; Key4 = Johnny";
    std::string const Example2 = "Key1 = Value1; Key2 = {InnerK1 = one; IK2 = 11-Nov-2011;};";
    std::string const Example3 = "K1 = V1; K2 = {IK1=IV1; IK2=IV2;}; K3=V3; K4 = {JK1=JV1; JK2=JV2;};";

    std::cout << DoParse(Example1.begin(), Example1.end()) << "\n";
    std::cout << DoParse(Example2.begin(), Example2.end()) << "\n";
    std::cout << DoParse(Example3.begin(), Example3.end()) << "\n";
}

¹ You "had" it, but not in the right place! It should go before any Boost includes.
