Matching single-line JavaScript comments (//) with re

Att*_* O. 6 javascript python regex replace

I want to use Python's re module to filter (mostly single-line) comments out of (mostly valid) JavaScript. For example:

// this is a comment
var x = 2 // and this is a comment too
var url = "http://www.google.com/" // and "this" too
url += 'but // this is not a comment' // however this one is
url += 'this "is not a comment' + " and ' neither is this " // only this

I have been trying this for over half an hour without any success. Can anyone help me?

Edit 1:

foo = 'http://stackoverflow.com/' // these // are // comments // too //

Edit 2:

bar = 'http://no.comments.com/'

dri*_*iax 7

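My regex skills are a bit rusty, so I used your question to refresh what I remember. It turned into a fairly large regex, mostly because I also wanted to filter out multi-line comments.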

import re

reexpr = r"""
    (                           # Capture code
        "(?:\\.|[^"\\])*"       # String literal
        |
        '(?:\\.|[^'\\])*'       # String literal
        |
        (?:[^/\n"']|/[^/*\n"'])+ # Any code besides newlines or string literals
        |
        \n                      # Newline
    )|
    (/\*  (?:[^*]|\*(?!/))*  \*/)        # Multi-line comment (also handles a trailing **/)
    |
    (?://(.*)$)                 # Comment
    $"""
rx = re.compile(reexpr, re.VERBOSE | re.MULTILINE)

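This regex matches with three different subgroups: one for code (group 1) and two for comment contents (group 2 for multi-line comments, group 3 for the text of one-line comments). Here is an example of how to extract them.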

code = r"""// this is a comment
var x = 2 * 4 // and this is a comment too
var url = "http://www.google.com/" // and "this" too
url += 'but // this is not a comment' // however this one is
url += 'this "is not a comment' + " and ' neither is this " // only this

bar = 'http://no.comments.com/' // these // are // comments
bar = 'text // string \' no // more //\\' // comments
bar = 'http://no.comments.com/'
bar = /var/ // comment

/* comment 1 */
bar = open() /* comment 2 */
bar = open() /* comment 2b */// another comment
bar = open( /* comment 3 */ file) // another comment 
"""

parts = rx.findall(code)
# Each findall() tuple is (code, multi-line comment, one-line comment text).
print('*' * 80, '\nCode:\n\n', '\n'.join([x[0] for x in parts if x[0].strip()]))
print('*' * 80, '\nMulti line comments:\n\n', '\n'.join([x[1] for x in parts if x[1].strip()]))
print('*' * 80, '\nOne line comments:\n\n', '\n'.join([x[2] for x in parts if x[2].strip()]))
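Since the question asks to filter the comments out rather than list them, one possible way is to run the same kind of pattern through re.sub and keep only the code group. The following is a rough sketch and not part of the code above. It assumes Python 3, and the names JS_TOKEN and strip_js_comments are made up for illustration. It also uses a lookahead in the block-comment part so that a comment ending in **/ is still recognised. Like the regex above, it does not understand regex literals or template strings that contain // or quotes.

```python
import re

# Strings and ordinary code are captured in group 1; comments are matched
# by the last two alternatives but never captured.
JS_TOKEN = re.compile(r"""
    (                                # Group 1: code to keep
        "(?:\\.|[^"\\])*"            # double-quoted string literal
        |
        '(?:\\.|[^'\\])*'            # single-quoted string literal
        |
        (?:[^/\n"']|/[^/*\n"'])+     # other code on the same line
        |
        \n                           # newline
    )
    |
    /\*(?:[^*]|\*(?!/))*\*/          # block comment, dropped
    |
    //[^\n]*                         # line comment, dropped
    """, re.VERBOSE)


def strip_js_comments(source):
    """Return source with // and /* */ comments removed (rough sketch)."""
    # For a comment match group 1 is None, so the match is replaced by "".
    stripped = JS_TOKEN.sub(lambda m: m.group(1) or "", source)
    # Trim the whitespace left behind where an end-of-line comment used to be.
    return "\n".join(line.rstrip() for line in stripped.split("\n"))


sample = '''// this is a comment
var x = 2 // and this is a comment too
var url = "http://www.google.com/" // and "this" too
url += 'but // this is not a comment' // however this one is
bar = 'http://no.comments.com/'
'''
print(strip_js_comments(sample))
```

A line that held only a comment is left as an empty line, so line numbers in the output still match the input.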