RegEx to extract the first number of 6 to 10 digits, excluding 8-digit numbers

Question

I have the following test file names:

abc001_20111104_summary_123.txt
abc008_200700953_timeline.txt
abc008_20080402_summary200201573unitf.txt
123456.txt
100101-100102 test.txt
abc008_20110902_summary200110254.txt
abcd 200601141 summary.txt
abc008_summary_200502169_xyz.txt


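I need to extract one number from each file name.

The number must be 6, 7, 9 or 10 digits long (so 8-digit numbers are excluded).

If more than one number is found I want the first one, and if none is found I want an empty string.

I managed to do this in two steps: first I remove the 8-digit numbers, then I extract the first 6- to 10-digit number from what is left.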

step 1 
  regex:  ([^0-9])([0-9]{8})([^0-9])
  replacement:  \1\3

step 2
  regex: (.*?)([1-9]([0-9]{5,6}|[0-9]{8,9}))([^0-9].*)
  replacement:  \2

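For readers who want to reproduce the two steps outside Regex Coach, here is a minimal sketch in Python's `re` module. The function name `extract_two_step` is made up for illustration, and returning an empty string when step 2 finds nothing is an assumption based on the stated requirement (a plain search-and-replace would leave the string unchanged instead).

```python
import re

# Step 1: an 8-digit run with a non-digit on both sides.
EIGHT_DIGITS = re.compile(r"([^0-9])([0-9]{8})([^0-9])")
# Step 2: lazy prefix, then a 6-, 7-, 9- or 10-digit number starting with 1-9,
# followed by a non-digit and the rest of the string.
CANDIDATE = re.compile(r"(.*?)([1-9]([0-9]{5,6}|[0-9]{8,9}))([^0-9].*)")


def extract_two_step(name: str) -> str:
    """Apply both steps from the question; return "" if nothing qualifies."""
    # Drop the 8-digit run but keep its neighbouring characters (\1 and \3).
    stripped = EIGHT_DIGITS.sub(r"\1\3", name)
    # Group 2 holds the number itself.
    match = CANDIDATE.match(stripped)
    return match.group(2) if match else ""
```

Called as `extract_two_step("abc008_20080402_summary200201573unitf.txt")`, step 1 drops `20080402` and step 2 then picks up `200201573`.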

The numbers I get after these two steps are exactly what I am looking for:

[]
[200700953]
[200201573]
[123456]
[100101]
[200110254]
[200601141]
[200502169]


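Now, the question is: is there a way to do this in a single step?

I have seen this nice solution to a similar question, but it gives me the last number when more than one is found.

Note: I am testing with Regex Coach.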

Answer 1

Tim*_*ker 7

Assuming your regex engine supports lookbehind assertions:

(?<!\d)\d{6}(?:\d?|\d{3,4})(?!\d)


Explanation:

(?<!\d)   # Assert that the previous character (if any) isn't a digit
\d{6}     # Match 6 digits
(?:       # Either match
 \d?      # 0 or 1 digits
|         # or
 \d{3,4}  # 3 or 4 digits
)         # End of alternation
(?!\d)    # Assert that the next character (if any) isn't a digit

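As a quick check, the pattern can be run against the file names from the question. This is only a sketch using Python's `re` module (which supports lookbehind); the helper name `first_number` is hypothetical, and the empty-string fallback follows the requirement stated in the question.

```python
import re

# The single-step pattern from the answer: 6 digits, then either 0-1 more
# digits (6 or 7 total) or 3-4 more digits (9 or 10 total), with no digit
# directly before or after the run.
NUMBER = re.compile(r"(?<!\d)\d{6}(?:\d?|\d{3,4})(?!\d)")


def first_number(name: str) -> str:
    """Return the first qualifying number in name, or "" if there is none."""
    match = NUMBER.search(name)  # search() stops at the leftmost match
    return match.group(0) if match else ""


FILENAMES = [
    "abc001_20111104_summary_123.txt",
    "abc008_200700953_timeline.txt",
    "abc008_20080402_summary200201573unitf.txt",
    "123456.txt",
    "100101-100102 test.txt",
    "abc008_20110902_summary200110254.txt",
    "abcd 200601141 summary.txt",
    "abc008_summary_200502169_xyz.txt",
]

for name in FILENAMES:
    print(f"[{first_number(name)}]")
```

This prints the same eight results listed in the question, starting with `[]` for the first file name. One difference from the two-step version is worth knowing: step 2 in the question requires the number to start with 1-9, while this pattern also accepts a leading 0.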
