我有一系列包含混合数字的文本(即:整个部分和一个小数部分).问题是文本充满了人为编码的邋iness:
我需要一个可以解析这些元素的正则表达式,这样我就可以从这个混乱中创建一个正确的数字.
Cra*_*ker 11
这是一个正则表达式,它将处理我可以抛出的所有数据:
(\d++(?! */))? *-? *(?:(\d+) */ *(\d+))?.*$
Run Code Online (Sandbox Code Playgroud)
这会将数字放入以下组中:
另外,这里是RegexBuddy对元素的解释(在构建它时极大地帮助了我):
Match the regular expression below and capture its match into backreference number 1 «(\d++(?! */))?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single digit 0..9 «\d++»
Between one and unlimited times, as many times as possible, without giving back (possessive) «++»
Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?! */)»
Match the character “ ” literally « *»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “/” literally «/»
Match the character “ ” literally « *»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “-” literally «-?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the character “ ” literally « *»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the regular expression below «(?:(\d+) */ *(\d+))?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the regular expression below and capture its match into backreference number 2 «(\d+)»
Match a single digit 0..9 «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “ ” literally « *»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “/” literally «/»
Match the character “ ” literally « *»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the regular expression below and capture its match into backreference number 3 «(\d+)»
Match a single digit 0..9 «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match any single character that is not a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Run Code Online (Sandbox Code Playgroud)