Remove comments from C/C++ code

Mik*_*ike 67 c c++ comments

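Is there an easy way to remove comments from a C/C++ source file without doing any preprocessing? (That is, I think you can use gcc -E, but that will expand macros.) I just want the source code with the comments stripped; nothing else should be changed.

EDIT:

I would prefer an existing tool. I don't want to write this myself with regexes; I foresee too many surprises in the code.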

Jos*_*Lee 97

Run the following command on your source file:

gcc -fpreprocessed -dD -E test.c

Thanks to KennyTM for finding the right flags. Here's the result for completeness:

test.c:

#define foo bar
foo foo foo
#ifdef foo
#undef foo
#define foo baz
#endif
foo foo
/* comments? comments. */
// c++ style comments

gcc -fpreprocessed -dD -E test.c:

#define foo bar
foo foo foo
#ifdef foo
#undef foo
#define foo baz
#endif
foo foo

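  • I added -P to the gcc options to suppress the odd linemarkers that sometimes show up where a comment at the start of a function has been removed. (13 upvotes)
  • I think the result Mike expects is `#define foo bar \nfoo foo foo` (3 upvotes)
  • @Pascal: run `gcc -fpreprocessed -dM -E test.c` to get the `#define`-s as well, but they are not in their original locations. (3 upvotes)
  • I also needed to add -P to get usable output. (2 upvotes)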

Jon*_*ler 15

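It depends on how perverse your comments are. I have a program, scc, that strips C and C++ comments. I also have a test file for it, and I tried GCC (4.2.1 on MacOS X) with the options from the currently selected answer. GCC does not seem to do a perfect job on some of the horribly butchered comments in the test case.

NB: This isn't a real-life problem - people don't write such ghastly code.

Consider a subset (36 of 135 lines total) of the test case: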

/\
*\
Regular
comment
*\
/
The regular C comment number 1 has finished.

/\
\/ This is not a C++/C99 comment!

This is followed by C++/C99 comment number 3.
/\
\
\
/ But this is a C++/C99 comment!
The C++/C99 comment number 3 has finished.

/\
\* This is not a C or C++ comment!

This is followed by regular C comment number 2.
/\
*/ This is a regular C comment *\
but this is just a routine continuation *\
and that was not the end either - but this is *\
\
/
The regular C comment number 2 has finished.

This is followed by regular C comment number 3.
/\
\
\
\
* C comment */

On my Mac, the output from GCC (gcc -fpreprocessed -dD -E subset.c) is:

/\
*\
Regular
comment
*\
/
The regular C comment number 1 has finished.

/\
\/ This is not a C++/C99 comment!

This is followed by C++/C99 comment number 3.
/\
\
\
/ But this is a C++/C99 comment!
The C++/C99 comment number 3 has finished.

/\
\* This is not a C or C++ comment!

This is followed by regular C comment number 2.
/\
*/ This is a regular C comment *\
but this is just a routine continuation *\
and that was not the end either - but this is *\
\
/
The regular C comment number 2 has finished.

This is followed by regular C comment number 3.
/\
\
\
\
* C comment */

The output from 'scc' is:

The regular C comment number 1 has finished.

/\
\/ This is not a C++/C99 comment!

This is followed by C++/C99 comment number 3.
/\
\
\
/ But this is a C++/C99 comment!
The C++/C99 comment number 3 has finished.

/\
\* This is not a C or C++ comment!

This is followed by regular C comment number 2.

The regular C comment number 2 has finished.

This is followed by regular C comment number 3.

The output from 'scc -C' (which recognizes double-slash comments) is:

The regular C comment number 1 has finished.

/\
\/ This is not a C++/C99 comment!

This is followed by C++/C99 comment number 3.

The C++/C99 comment number 3 has finished.

/\
\* This is not a C or C++ comment!

This is followed by regular C comment number 2.

The regular C comment number 2 has finished.

This is followed by regular C comment number 3.

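Source for SCC now available on GitHub

The current version of SCC is 6.60 (dated 2016-06-12), though the Git versions were created on 2017-01-18 (in the US/Pacific time zone). The code is available from GitHub at https://github.com/jleffler/scc-snapshots. You can also find snapshots of the previous releases (4.03, 4.04, 5.05) and two pre-releases (6.16, 6.50); these are all tagged release/x.yz.

The code is still primarily developed under RCS. I'm still working out how I want to use sub-modules or a similar mechanism to handle common library files like stderr.c and stderr.h (which can also be found in https://github.com/jleffler/soq).

SCC version 6.60 attempts to understand C++11, C++14 and C++17 constructs such as binary constants, numeric punctuation, raw strings, and hexadecimal floats. It defaults to C11 mode operation. (Note that the meaning of the -C flag mentioned above flipped between version 4.0x, described in the main body of the answer, and version 6.60, which is currently the latest release.)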
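
The test cases above are hard because of line splicing: in standard C, a backslash immediately followed by a newline is deleted (translation phase 2) before comments are recognized, so `/\` followed on the next line by `*` really does open a comment. GCC's documentation states that -fpreprocessed suppresses escaped-newline splicing, which would be consistent with the GCC output shown above. What follows is a minimal sketch of the splicing step only. It is not taken from scc, the name splice_lines is hypothetical, and it assumes Unix line endings and no trigraphs.

```c
#include <stddef.h>

/* Hypothetical helper: copy src to dst with every backslash-newline pair
 * removed, as translation phase 2 would do.
 * dst must have room for at least strlen(src) + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL. */
size_t splice_lines(const char *src, char *dst)
{
    size_t n = 0;

    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            src += 2;               /* drop the splice, join the two lines */
        } else {
            dst[n++] = *src++;      /* ordinary character, copy it */
        }
    }
    dst[n] = '\0';
    return n;
}
```

Applied to the first test case, the text `/\`, `*\`, `x*\`, `/` (one item per line) becomes `/*x*/`, which any comment stripper recognizes. A tool that splices first and strips afterwards would handle these cases, at the cost of changing the line layout of the output.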

  • Believe me, Jonathan, they do. I cleaned up code in which 2000 lines were commented out. I couldn't believe a human being could write such a mess. (5 upvotes)

lhf*_*lhf 7

gcc -fpreprocessed -dD -E did not work for me, but this program does it:

#include <stdio.h>

/* Copy f to stdout, replacing each C comment with a single space. */
static void process(FILE *f)
{
 int c;
 while ( (c=getc(f)) != EOF )
 {
  if (c=='\'' || c=='"')            /* literal */
  {
   int q=c;
   do
   {
    putchar(c);
    if (c=='\\')                    /* copy escaped character verbatim */
    {
     c=getc(f);
     if (c==EOF) break;
     putchar(c);
    }
    c=getc(f);
   } while (c!=q && c!=EOF);        /* stop at EOF in an unterminated literal */
   if (c!=EOF) putchar(c);
  }
  else if (c=='/')              /* opening comment ? */
  {
   c=getc(f);
   if (c!='*')                  /* no, recover */
   {
    putchar('/');
    if (c!=EOF) ungetc(c,f);
   }
   else
   {
    int p;
    putchar(' ');               /* replace comment with space */
    c=0;                        /* the opener's '*' must not close the comment */
    do
    {
     p=c;
     c=getc(f);
    } while (c!=EOF && (c!='/' || p!='*'));
   }
  }
  else
  {
   putchar(c);
  }
 }
}

int main(void)
{
 process(stdin);
 return 0;
}

  • Doesn't handle trigraphs. (4 upvotes)
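
The program in this answer strips only `/* ... */` comments. As a hedged illustration of how the same idea could extend to `//` comments, here is a small state-machine sketch that works on an in-memory string. The name strip_comments and the buffer-based interface are assumptions made for this example, not part of the answer. Like the original, it ignores trigraphs, and it also ignores backslash-newline splices and C++11 raw strings.

```c
#include <stddef.h>

/* Hypothetical helper: copy src to dst without block or line comments.
 * dst must have room for at least strlen(src) + 1 bytes.
 * Each block comment becomes one space; the newline that ends a line
 * comment is kept. String and character literals are copied untouched. */
void strip_comments(const char *src, char *dst)
{
    enum { CODE, LITERAL, BLOCK, LINE } state = CODE;
    char quote = 0;
    size_t i = 0, n = 0;

    while (src[i] != '\0') {
        char c = src[i];
        switch (state) {
        case CODE:
            if (c == '"' || c == '\'') {
                quote = c; state = LITERAL; dst[n++] = c; i++;
            } else if (c == '/' && src[i + 1] == '*') {
                state = BLOCK; dst[n++] = ' '; i += 2;
            } else if (c == '/' && src[i + 1] == '/') {
                state = LINE; i += 2;
            } else {
                dst[n++] = c; i++;
            }
            break;
        case LITERAL:
            dst[n++] = c; i++;
            if (c == '\\' && src[i] != '\0') {
                dst[n++] = src[i++];    /* escaped character, copy verbatim */
            } else if (c == quote) {
                state = CODE;           /* closing quote */
            }
            break;
        case BLOCK:
            if (c == '*' && src[i + 1] == '/') { state = CODE; i += 2; }
            else i++;
            break;
        case LINE:
            if (c == '\n') { state = CODE; dst[n++] = c; }
            i++;
            break;
        }
    }
    dst[n] = '\0';
}
```

Keeping the four states explicit makes it harder to confuse a `/*` inside a string literal with a real comment opener, which is the usual pitfall of regex-based approaches.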

che*_*che 7

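There is a stripcmt program that can do this:

StripCmt is a simple utility written in C to remove comments from C, C++, and Java source files. In the tradition of Unix text processing programs, it can function either as a FIFO (first in, first out) filter or accept arguments on the command line.

(per hlovdal's answer to a question about Python code for this)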