String tokenization in C++, including the delimiter characters

Avi*_*ash 4 c++ stl

I have strings of the form a = x + y, abc = xyz + 56 + 5, or f(p).

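I need to tokenize the string so that I read each operator and operand separately. For a = x + y the tokens returned should be a, =, x, +, y, and for abc=xyz+5 they should be abc, =, xyz, +, 5. Note that there may or may not be spaces between the operators and the operands.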

Here is what I have tried:

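#include <cstring>
#include <string>
#include <vector>

void tokenize(std::vector<std::string>& tokens, const char* input, const char* delimiters) {
    const char* s = input;
    while (*s != 0) {
        // Advance e to the next delimiter, or to the end of the input.
        const char* e = s;
        while (*e != 0 && std::strchr(delimiters, *e) == 0) {
            ++e;
        }
        // Emit the operand before the delimiter so tokens stay in input order.
        if (e - s > 0) {
            tokens.push_back(std::string(s, e - s));
        }
        // strchr also matches the terminating '\0', so stop here explicitly.
        if (*e == 0) {
            break;
        }
        // Emit the delimiter itself as a token, unless it is a space.
        if (*e != ' ') {
            tokens.push_back(std::string(1, *e));
        }
        s = e + 1;
    }
}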

lin*_*llo 5

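You can use this implementation. The first argument is the std::string to tokenize, and the second is the set of delimiter characters to split on. It returns a vector of the resulting tokens. It is simple and effective.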

#include <string>
#include <vector>

using std::string;
using std::vector;

vector<string> tokenizeString(const string& str, const string& delimiters)
{
    vector<string> tokens;
    // Skip delimiters at the beginning.
    string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // Find the first delimiter after that; it marks the end of the first token.
    string::size_type pos = str.find_first_of(delimiters, lastPos);

    while (string::npos != pos || string::npos != lastPos)
    {
        // Found a token, add it to the vector.
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Skip delimiters. Note the "not_of".
        lastPos = str.find_first_not_of(delimiters, pos);
        // Find the next delimiter.
        pos = str.find_first_of(delimiters, lastPos);
    }
    return tokens;
}
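
The function above discards every delimiter, whereas the question asks for operators such as = and + to come back as tokens too. One possible variation is sketched below. The name tokenizeKeepingDelimiters is hypothetical and not part of the original answer, and the sketch assumes that every operator is a single character and that a space is the only delimiter that should be dropped.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): splits `str` on any
// character in `delimiters`, keeps each non-space delimiter as its own
// one-character token, and drops spaces.
std::vector<std::string> tokenizeKeepingDelimiters(const std::string& str,
                                                   const std::string& delimiters)
{
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (start < str.size()) {
        // The current operand ends at the next delimiter or at the end of input.
        std::string::size_type pos = str.find_first_of(delimiters, start);
        if (pos == std::string::npos) {
            pos = str.size();
        }
        // Emit the operand, if there is one, before the delimiter.
        if (pos > start) {
            tokens.push_back(str.substr(start, pos - start));
        }
        // Emit the delimiter itself unless it is a space (assumption: only
        // spaces are insignificant).
        if (pos < str.size() && str[pos] != ' ') {
            tokens.push_back(std::string(1, str[pos]));
        }
        start = pos + 1;
    }
    return tokens;
}
```

Called as tokenizeKeepingDelimiters("a = x + y", " =+()"), this returns a, =, x, +, y, and the same delimiter set turns f(p) into f, (, p, ). Multi-character operators such as == would need extra handling that this sketch does not attempt.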