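I'm working on an automatic summarization system for my C++ class and have a question about one of the ASCII comparisons I'm doing. Here is the code: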
    char ch;
    string sentence;
    pair<char, char> sentenceCheck;
    int counter = 0;
    while (!ifs2.eof())
    {
        ch = ifs2.get();
        ch = tolower(ch);
        if (ch == 13)
            ch = ifs2.get();
        if (ch != 10 && ch != '?' && ch != '!' && ch != '.')
            sentence += ch;
        sentenceCheck.first = sentenceCheck.second;
        sentenceCheck.second = ch;
        cout << sentenceCheck.first << "-" << (int)sentenceCheck.first << " ---- " << sentenceCheck.second << "-" << (int)sentenceCheck.second << endl;
        if (sentenceCheck.second == ' ' || sentenceCheck.second == 10 || sentenceCheck.second == -1)
        {
            if (sentenceCheck.first == '?' || sentenceCheck.first == '!' || sentenceCheck.first == '.')
            {
                istringstream s(sentence);
                while (s >> wordInSentence)
                {
                    sentenceWordMap.insert(pair<string, int>(wordInSentence, 0));
                }
                //sentenceList.push_back(pair<string, int>(sentence, 0));
                sentence.clear();
            }
        }
    }
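What this does (with the two if statements) is check whether a new sentence has started in the text that is to be analyzed and processed later. The conditions work, but only because we found out that we have to check for -1. Any idea what it represents?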
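-1 doesn't represent anything in ASCII. All ASCII codes are in the range [0, 127]. C++ doesn't even guarantee that -1 is a valid value for a char, since char may be an unsigned type.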
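To see why comparing a char against -1 is fragile, here is a minimal sketch. The helper names are made up for illustration, and it spells out the signedness explicitly instead of relying on whatever plain char happens to be on a given compiler.

```cpp
#include <cstdio>   // EOF

// Hypothetical helpers, named for illustration only.

// Narrow an int (such as the result of get()) into a signed char and
// check whether the original value is still recognizable afterwards.
bool survives_signed_char(int value) {
    const signed char c = static_cast<signed char>(value);
    return c == value;
}

// Same narrowing into an unsigned char. A negative value such as EOF
// becomes non-negative (255 for an 8-bit char) and no longer matches.
bool survives_unsigned_char(int value) {
    const unsigned char c = static_cast<unsigned char>(value);
    return c == value;
}
```

With these, survives_signed_char(-1) is true while survives_unsigned_char(EOF) is false, so a test like `ch == -1` on a plain char can only work where char is signed.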
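The problem is that you're not checking the return value of ifs2.get(), which returns an int (not a char!) that may be -1 at end of file. The correct way to check for this is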
    int ch = ifs2.get();
    if (!ifs2)
        break; // leave the loop: end of file or read error
because the EOF value is not guaranteed to be -1 (it is actually std::char_traits<char>::eof()).
(By the way, you shouldn't write ASCII codes as magic numbers; use '\n' for line feed and '\r' for carriage return.)
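Putting these points together, the reading loop could look roughly like the sketch below. It is not the original poster's code: split_sentences and is_terminator are hypothetical names, the result is collected into a vector instead of sentenceWordMap, and a line break in the middle of a sentence is kept as a space, which is an assumption about the desired behavior.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: true for the characters that end a sentence.
bool is_terminator(char c) { return c == '.' || c == '!' || c == '?'; }

// Hypothetical helper: split a stream into lowercased sentences.
// A sentence ends at '.', '!' or '?' followed by a space, a line feed,
// or the end of input. Trailing text without a terminator is dropped.
std::vector<std::string> split_sentences(std::istream& in) {
    using traits = std::char_traits<char>;
    std::vector<std::string> sentences;
    std::string sentence;
    char previous = '\0';

    for (;;) {
        // Keep the result as an int type so end-of-file stays distinguishable.
        const traits::int_type value = in.get();
        const bool at_end = traits::eq_int_type(value, traits::eof());

        char ch = '\n';  // placeholder, only used when at_end is true
        if (!at_end) {
            ch = traits::to_char_type(value);
            if (ch == '\r') continue;  // skip the CR of a CRLF line ending
            ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
        }

        const bool boundary = at_end || ch == ' ' || ch == '\n';
        if (boundary && is_terminator(previous)) {
            sentences.push_back(sentence);
            sentence.clear();
        } else if (!at_end && !is_terminator(ch)) {
            sentence += (ch == '\n') ? ' ' : ch;  // assumption: line break acts as a space
        }

        if (at_end) break;
        previous = ch;
    }
    return sentences;
}
```

For example, passing a std::istringstream that holds "Hello world. Is it?\r\nYes!" should yield the three sentences "hello world", "is it" and "yes". The same function accepts an std::ifstream such as ifs2, since it only relies on the std::istream interface.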