Related questions and solutions

How do I match Unicode characters with boost::spirit?

How do I use UTF-8 Unicode characters with boost::spirit?

For example, I want to recognize every character in this string:

$ echo "?? ?????? ????????? ????" | ./a.out
? ? ? ? ? ? ?? ? ? ? ? ? ? ? ? ? ? ? ? ?

When I try this simple boost::spirit program, it fails to match the Unicode characters correctly:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_istream_iterator.hpp>
#include <boost/foreach.hpp>
#include <iostream>  // std::cin
#include <vector>    // std::vector
namespace qi = boost::spirit::qi;

int main() {
  std::cin.unsetf(std::ios::skipws);
  boost::spirit::istream_iterator begin(std::cin);
  boost::spirit::istream_iterator end;

  std::vector<char> letters;
  bool result = qi::phrase_parse(
      begin, end,  // input     
      +qi::char_,  // match every character
      qi::space,   // skip whitespace 
      letters);    // result    

  BOOST_FOREACH(char letter, letters) { …
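A likely reason for the mismatch is that `qi::char_`, reading from `std::cin` into a `std::vector<char>`, works on individual bytes, while a non-ASCII character encoded in UTF-8 occupies two to four bytes. The sketch below is not part of the question and does not use Spirit. It is a minimal hand-written decoder (the name `decode_utf8` is hypothetical) that groups UTF-8 bytes into code points, only to illustrate the difference between bytes and code points. It substitutes U+FFFD for malformed sequences and does not reject overlong encodings or surrogates, so it should not be treated as a validating decoder.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decode a UTF-8 byte string into Unicode code points.
// Simplification: malformed or truncated sequences become U+FFFD;
// overlong encodings and surrogate code points are not rejected.
std::vector<char32_t> decode_utf8(const std::string& bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else {
            // Stray continuation byte or invalid lead byte.
            out.push_back(0xFFFD);
            ++i;
            continue;
        }
        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            // Each continuation byte must look like 10xxxxxx.
            if (i >= bytes.size() ||
                (static_cast<unsigned char>(bytes[i]) & 0xC0) != 0x80) {
                ok = false;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
            ++i;
        }
        out.push_back(ok ? cp : 0xFFFD);
    }
    return out;
}
```

For instance, `decode_utf8("a\xC3\xA9\xE4\xBD\xA0")` yields three code points (U+0061, U+00E9, U+4F60) from six bytes, whereas a byte-oriented parser would report six separate `char` values for the same input.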

c++ parsing boost boost-spirit

Votes: 9, Answers: 1, Views: 3843

How do I use boost::spirit to parse UTF-8?

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

#define BOOST_SPIRIT_UNICODE // We'll use unicode (UTF8) all throughout

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_parse.hpp>
#include <boost/spirit/include/support_standard_wide.hpp>

void parse_simple_string()
{
    namespace qi = boost::spirit::qi;    
    namespace encoding  = boost::spirit::unicode;
    //namespace stw = boost::spirit::standard_wide;

    typedef std::wstring::const_iterator iterator_type;

    std::vector<std::wstring> result;
    std::wstring const input = LR"(12,3","ab,cd","G,G\"GG","kkk","10,\"0","99987","PPP","你好)";

    qi::rule<iterator_type, std::wstring()> key = +(qi::unicode::char_ - qi::lit(L"\",\""));
    qi::phrase_parse(input.begin(), input.end(),
                     key % qi::lit(L"\",\""),
                     encoding::space,
                     result);

    //std::copy(result.rbegin(), result.rend(), std::ostream_iterator<std::wstring, wchar_t>  (std::wcout, L"\n"));
    for(auto const &data : result) std::wcout<<data<<std::endl;
}

I studied the post "How to use Boost Spirit to parse Chinese (unicode utf-16)?" and followed its guidance, but I still cannot parse "你好" …
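For orientation, the grammar `key % qi::lit(L"\",\"")` in the code above amounts to splitting the input on the three-character delimiter `","`. The sketch below shows that splitting step on a `std::u32string`, where each element is one code point, so "你好" counts as two elements. It does not use Spirit, and the name `split_fields` is hypothetical. It also differs from the grammar in two ways that are assumptions of this sketch: it performs no whitespace skipping, and it returns empty fields, whereas `+(...)` requires at least one character per key.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `input` on every occurrence of `delim`.
// The delimiter itself is dropped; empty fields are kept.
std::vector<std::u32string> split_fields(const std::u32string& input,
                                         const std::u32string& delim) {
    std::vector<std::u32string> fields;
    if (delim.empty()) {
        // Nothing to split on; return the input as a single field.
        fields.push_back(input);
        return fields;
    }
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = input.find(delim, start);
        if (pos == std::u32string::npos) {
            fields.push_back(input.substr(start));  // last field
            break;
        }
        fields.push_back(input.substr(start, pos - start));
        start = pos + delim.size();  // continue after the delimiter
    }
    return fields;
}
```

Calling `split_fields(U"12,3\",\"ab,cd\",\"\u4F60\u597D", U"\",\"")` would give the three fields `12,3`, `ab,cd`, and `你好`. Working on code points rather than bytes or platform-dependent `wchar_t` units keeps the element count independent of the encoding width.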

c++ unicode boost utf-8 boost-spirit

Votes: 5, Answers: 1, Views: 3546
