Regular expression to match sloppy fractions / mixed numbers

Cra*_*ker 13 regex

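I have a series of texts that contain mixed numbers (i.e., a whole part and a fractional part). The problem is that the text is full of human-entered sloppiness:

  1. The whole part may or may not exist (e.g. "10")
  2. The fractional part may or may not exist (e.g. "1/3")
  3. The two parts can be separated by spaces and/or a hyphen (e.g. "10 1/3", "10-1/3", "10 - 1/3").
  4. The fraction itself may or may not have spaces between the numbers and the slash (e.g. "1/3", "1 /3", "1 / 3").
  5. There may be other text after the fraction that needs to be ignored

I need a regular expression that can parse these elements so that I can build a proper number out of this mess.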

Cra*_*ker 11

Here is a regex that handles all of the data I can throw at it:

(\d++(?! */))? *-? *(?:(\d+) */ *(\d+))?.*$

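This puts the digits into the following groups:

  1. The whole part of the mixed number, if it exists
  2. The numerator, if a fraction exists
  3. The denominator, if a fraction exists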
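As a rough illustration of how the three groups can be turned into an actual number, here is a minimal Python sketch. The function name `parse_mixed_number` and the choice of `fractions.Fraction` as the result type are assumptions made for this example, not part of the original answer. The possessive `\d++` is replaced by an extra `\d` alternative inside the lookahead, which should give the same results on these inputs while also running on Python versions older than 3.11.

```python
import re
from fractions import Fraction

# Same structure as the regex above. The possessive \d++ is emulated by
# adding \d to the negative lookahead, so the whole part can never stop
# in the middle of a run of digits.
MIXED_NUMBER = re.compile(r"(\d+(?!\d| */))? *-? *(?:(\d+) */ *(\d+))?.*$")


def parse_mixed_number(text):
    """Return the leading mixed number in text as a Fraction, or None.

    Hypothetical helper for illustration. A zero denominator raises
    ZeroDivisionError, because Fraction rejects it.
    """
    match = MIXED_NUMBER.match(text)
    if match is None:
        return None
    whole, numerator, denominator = match.groups()
    if whole is None and numerator is None:
        return None  # nothing numeric at the start of the text
    value = Fraction(int(whole)) if whole is not None else Fraction(0)
    if numerator is not None:
        value += Fraction(int(numerator), int(denominator))
    return value


for sample in ["10", "1/3", "10 1/3", "10-1/3", "10 - 1/3", "1 / 3", "10 1/3 cups"]:
    print(repr(sample), "->", parse_mixed_number(sample))
```

Using `Fraction` keeps the value exact; call `float()` on the result if a decimal approximation is enough.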

Also, here is RegexBuddy's explanation of the elements (which helped me immensely while building it):

Match the regular expression below and capture its match into backreference number 1 «(\d++(?! */))?»
   Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   Match a single digit 0..9 «\d++»
      Between one and unlimited times, as many times as possible, without giving back (possessive) «++»
   Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?! */)»
      Match the character “ ” literally « *»
         Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
      Match the character “/” literally «/»
Match the character “ ” literally « *»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “-” literally «-?»
   Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the character “ ” literally « *»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the regular expression below «(?:(\d+) */ *(\d+))?»
   Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   Match the regular expression below and capture its match into backreference number 2 «(\d+)»
      Match a single digit 0..9 «\d+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
   Match the character “ ” literally « *»
      Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   Match the character “/” literally «/»
   Match the character “ ” literally « *»
      Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   Match the regular expression below and capture its match into backreference number 3 «(\d+)»
      Match a single digit 0..9 «\d+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match any single character that is not a line break character «.*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
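The possessive `++` in the explanation above matters more than it may look. A small Python sketch of what can go wrong with a plain greedy `\d+`: the input "13/4" is split into a whole part of 1 and a fraction of 3/4. The names `TAIL`, `GREEDY`, `GUARDED`, and `groups` are made up for this demonstration. The guarded variant emulates the possessive quantifier with an extra `\d` in the lookahead; as far as I know, Python 3.11 and later also accept `\d++` directly.

```python
import re

# Shared remainder of the pattern: separator, optional fraction, trailing text.
TAIL = r" *-? *(?:(\d+) */ *(\d+))?.*$"

# Plain greedy \d+ may backtrack and give up digits to satisfy the lookahead.
GREEDY = re.compile(r"(\d+(?! */))?" + TAIL)

# The extra \d in the lookahead forbids stopping in the middle of a number,
# which mimics the effect of the possessive \d++.
GUARDED = re.compile(r"(\d+(?!\d| */))?" + TAIL)


def groups(pattern, text):
    """Return the captured groups for text, or None if nothing matches."""
    match = pattern.match(text)
    return match.groups() if match else None


print(groups(GREEDY, "13/4"))   # ('1', '3', '4'): misread as 1 3/4
print(groups(GUARDED, "13/4"))  # (None, '13', '4'): read as 13/4
```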