sco*_*ttm 4 c++ regex iterator greedy boost-regex
我试图找出字符串中有多少正则表达式匹配.我正在使用迭代器迭代匹配,并使用整数来记录有多少.
long int before = GetTickCount();
string text;
boost::regex re("^(\\d{5})\\s(\\d{8})\\s(.*)\\s(.*)\\s(.*)\\s(\\d{8})\\s(.{1})$");
char * buffer;
long length;
long count;
ifstream f;
f.open("c:\\temp\\test.txt", ios::in | ios::ate);
length = f.tellg();
f.seekg(0, ios::beg);
buffer = new char[length];
f.read(buffer, length);
f.close();
text = buffer;
boost::sregex_token_iterator itr(text.begin(), text.end(), re, 0);
boost::sregex_token_iterator end;
count = 0;
for(; itr != end; ++itr)
{
    count++;
}
long int after = GetTickCount();
cout << "Found " << count << " matches in " << (after-before) << " ms." << endl;
在我的例子中,count总是返回1,即使我把代码放在for循环中以显示匹配(并且有很多).这是为什么?我究竟做错了什么?
测试输入:
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
输出(不匹配):
在16毫秒内找到1场比赛.
如果我将for循环更改为:
count = 0;
for(; itr != end; ++itr)
{
    string match(itr->first, itr->second);
    cout << match << endl;
    count++;
}
我得到这个作为输出:
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
12345 12345678 SOME NAME SOMETHING 88888888 N
Found 1 matches in 47 ms.
cha*_*aos 12
嘿.你的问题是你的正则表达式.将您的(.\*)s 更改为(.\*?)s(假设支持).你认为你看到每一行都匹配,但实际上你看到整个文本是匹配的,因为你的模式是贪婪的.
要查看说明的问题,请将循环中的调试输出更改为:
cout << "[" << match << "]" << endl;