C++: reading a file word by word, without any punctuation

xmz*_*xmz 2 c++ string parsing lexical-analysis ifstream

I want to read a text file word by word. Here is my C++ code:

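#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main(int argc, const char * argv[]) {
    ifstream file("./wordCount.txt");
    string word;
    while (file >> word) {   // operator>> splits on whitespace only
        cout << word << endl;
    }

    return 0;
}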

The text file contains the following sentence:

I don't have power, but he has power.

This is the result I get:

I
don\241\257t
have
power,
but
he
has
power.

Could you tell me how to get the result in the following format:

I
don't
have
power
but
he
has
power

Thanks.

Chr*_*phe 5

I understand that you want to get rid of the punctuation.

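Unfortunately, extracting strings from a stream only treats whitespace as a separator. So "don't" or "Hello,world" would be read as one word, while "don' t" or "Hello, world" would be read as two words.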

An alternative is to read the text line by line and use string::find_first_of() to jump from separator to separator:

#include <iostream>
#include <string>
using namespace std;

int main() {
    const string separator{" \t\r\n,.!?;:"};
    string line;
    while (getline(cin, line)) {   // read line by line
        size_t e, s = 0;           // s = offset of next word, e = end of next word
        do {
            s = line.find_first_not_of(separator, s);  // skip leading separators
            if (s == string::npos)                     // stop if no word left
                break;
            e = line.find_first_of(separator, s);      // find next separator
            string word(line.substr(s, e - s));        // construct the word
            cout << word << endl;
            s = e + 1;                                 // position after the separator
        } while (e != string::npos);                   // loop if end of line not reached
    }
    return 0;
}

Online demo