Ind*_*ant 7 c++ regex boost boost-regex
#include <boost/regex.hpp>
#include <string>
#include <iostream>
using namespace boost;

static const regex regexp(
    "std::vector<"
        "(std::map<"
            "(std::pair<((\\w+)(::)?)+, (\\w+)>,?)+"
        ">,?)+"
    ">");

std::string errorMsg =
    "std::vector<"
        "std::map<"
            "std::pair<Test::Test, int>,"
            "std::pair<Test::Test, int>,"
            "std::pair<Test::Test, int>"
        ">,"
        "std::map<"
            "std::pair<Test::Test, int>,"
            "std::pair<Test::Test, int>,"
            "std::pair<Test::Test, int>"
        ">"
    ">";

int main()
{
    smatch result;
    if (regex_match(errorMsg, result, regexp))
    {
        for (unsigned i = 0; i < result.size(); ++i)
        {
            std::cout << result[i] << std::endl;
        }
    }
    // std::cout << errorMsg << std::endl;
    return 0;
}
This produces:
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::runtime_error>
>' what(): Ran out of stack space trying to match the regular expression.
Compiled with:
g++ regex.cc -lboost_regex
Edit
My platform:
g++ (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5
libboost-regex1.42
Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
So the latest Ubuntu 64 bit
Bil*_*eal 13
((\\w+)(::)?)+
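This is one of the so-called "pathological" regular expressions: it can take exponential time, because you have two expressions that depend on each other, one right after the other. That is, it fails due to catastrophic backtracking.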
Consider what happens if we follow the linked example and reduce "something more complicated" to "x". Let's do that for \\w:
((x+)(::)?)+
Let's also assume that our input never contains ::. The optional :: only makes the regex more complex, so throwing it away should, if anything, make things simpler:
(x+)+
Now you have a textbook nested-quantifier problem, as detailed in the link above.
There are several ways to fix this, but the simplest is probably to disallow backtracking into the inner match by using the atomic group modifier "(?>":
((?>\\w+)(::)?)+