wub*_*dub 22 javascript ruby regex
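I'm looking for a way, in either Ruby or Javascript, to get all matches of a regex within a string, including overlapping ones.
Say I have str = "abcadc" and I want to find occurrences of a followed by any number of characters, followed by c. The result I'm looking for is ["abc", "adc", "abcadc"]. Any ideas on how to accomplish this?
str.scan(/a.*c/) gives me ["abcadc"], and str.scan(/(?=(a.*c))/).flatten gives me ["abcadc", "adc"].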
ndn*_*kov 11
def matching_substrings(string, regex)
  # \A and \z anchor to the whole substring; ^ and $ would only anchor to a line.
  anchored = /\A#{regex}\z/
  string.size.times.each_with_object([]) do |start_index, matches|
    start_index.upto(string.size.pred) do |end_index|
      substring = string[start_index..end_index]
      matches.push(substring) if substring =~ anchored
    end
  end
end
matching_substrings('abcadc', /a.*c/) # => ["abc", "abcadc", "adc"]
matching_substrings('foobarfoo', /(\w+).*\1/)
# => ["foobarf",
# "foobarfo",
# "foobarfoo",
# "oo",
# "oobarfo",
# "oobarfoo",
# "obarfo",
# "obarfoo",
# "oo"]
matching_substrings('why is this downvoted?', /why.*/)
# => ["why",
# "why ",
# "why i",
# "why is",
# "why is ",
# "why is t",
# "why is th",
# "why is thi",
# "why is this",
# "why is this ",
# "why is this d",
# "why is this do",
# "why is this dow",
# "why is this down",
# "why is this downv",
# "why is this downvo",
# "why is this downvot",
# "why is this downvote",
# "why is this downvoted",
# "why is this downvoted?"]
aef*_*aef 11
In Ruby you can get the expected result with:
str = "abcadc"
[/(a[^c]*c)/, /(a.*c)/].flat_map{ |pattern| str.scan(pattern) }.reduce(:+)
# => ["abc", "adc", "abcadc"]
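Whether this approach suits you depends heavily on what you actually want to achieve.
I tried to put this into a single expression but couldn't make it work. I'd really like to know whether there is some scientific reason this can't be done by a regular expression, or whether I just don't know enough about Ruby's regex engine, Oniguruma, to do it.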
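One likely reason a single pattern falls short: a backtracking engine reports at most one match per start position, so a scan can return the greedy or the lazy candidate but nothing in between. The sketch below illustrates this on a hypothetical input, "abcbcbc", which is not from the question; the variable names are illustrative only.

```ruby
# Hypothetical input with three possible end points for a match starting at "a".
str = "abcbcbc"

# Greedy quantifier: the engine settles on the longest candidate at each start.
greedy = str.scan(/(?=(a.*c))/).flatten   # => ["abcbcbc"]

# Lazy quantifier: the engine settles on the shortest candidate at each start.
lazy = str.scan(/(?=(a.*?c))/).flatten    # => ["abc"]

# Combining both still misses the middle candidate "abcbc".
combined = (greedy + lazy).uniq           # => ["abcbcbc", "abc"]
```

This is why the two-pattern trick works for "abcadc" but does not generalise once a start position has more than two valid end points.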
In JS:
function doit(r, s) {
    var res = [], cur;
    // Anchor the pattern; drop the g and y flags so test() does not carry lastIndex between calls.
    r = new RegExp('^(?:' + r.source + ')$', r.flags.replace(/[gy]/g, ''));
    for (var q = 0; q < s.length; ++q)
        for (var w = q; w <= s.length; ++w)
            if (r.test(cur = s.substring(q, w)))
                res.push(cur);
    return res;
}
document.body.innerHTML += "<pre>" + JSON.stringify(doit( /a.*c/g, 'abcadc' ), 0, 4) + "</pre>";
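You want all possible matches, including overlapping ones. As you've noted, the lookahead trick from "How to find overlapping matches with a regexp?" doesn't work for your case.
In the general case, the only thing I can think of is to generate every possible substring of the string and check each one against an anchored version of the regex. This is brute force, but it works.
Ruby: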
def all_matches(str, regex)
  (n = str.length).times.reduce([]) do |subs, i|
    subs + [*i..n].map { |j| str[i, j - i] }
  end.uniq.grep(/\A#{regex}\z/) # \A and \z anchor to the whole substring, not just a line
end
all_matches("abcadc", /a.*c/)
#=> ["abc", "abcadc", "adc"]
JavaScript:
function allMatches(str, regex) {
  var i, j, len = str.length, subs = {};
  // Wrap the source in a group so alternations such as a|b are anchored as a whole.
  var anchored = new RegExp('^(?:' + regex.source + ')$');
  for (i = 0; i < len; ++i) {
    for (j = i; j <= len; ++j) {
      subs[str.slice(i, j)] = true; // object keys de-duplicate the substrings
    }
  }
  return Object.keys(subs).filter(function(s) { return anchored.test(s); });
}
str = "abcadc"
from = str.split(/(?=\p{L})/).map.with_index { |c, i| i if c == 'a' }.compact
to = str.split(/(?=\p{L})/).map.with_index { |c, i| i if c == 'c' }.compact
from.product(to).select { |f,t| f < t }.map { |f,t| str[f..t] }
# => [
#   [0] "abc",
#   [1] "abcadc",
#   [2] "adc"
# ]
I'm sure there is a fancy way to find all indices of a character in a string, but I couldn't find it :( Any ideas?
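One possible way to collect those indices, sketched under the assumption that each character of interest is a single code point (so the positions line up with String#[]). The helper name char_indices is hypothetical and not part of the answer above.

```ruby
# Hypothetical helper: positions of every occurrence of a single character.
def char_indices(str, char)
  str.each_char.with_index      # pairs each character with its position
     .select { |c, _i| c == char }
     .map(&:last)               # keep only the position
end

char_indices("abcadc", "a") # => [0, 3]
char_indices("abcadc", "c") # => [2, 5]
```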
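Splitting on the "unicode char boundary" makes it work with strings like 'a?bc?' or 'U?ve Østergaard'.
A more general solution, accepting any "from" and "to" sequences, needs only one small change: find all indices of "from" and "to" in the string.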
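A minimal sketch of that generalisation, assuming "from" and "to" are literal strings rather than patterns. The helper names occurrences and between are hypothetical; a zero-width lookahead is used so overlapping occurrences are all visited.

```ruby
# Hypothetical helper: start index of every (possibly overlapping) occurrence of a literal.
def occurrences(str, literal)
  found = []
  # The lookahead consumes nothing, so scan advances one character at a time.
  str.scan(/(?=#{Regexp.escape(literal)})/) { found << Regexp.last_match.begin(0) }
  found
end

# Hypothetical helper: every substring that starts with `from` and ends with `to`.
def between(str, from, to)
  starts = occurrences(str, from)
  ends   = occurrences(str, to).map { |i| i + to.length } # exclusive end offsets
  starts.product(ends)
        .select { |s, e| e - s >= from.length + to.length } # from and to must not overlap
        .map { |s, e| str[s...e] }
end

between("abcadc", "a", "c") # => ["abc", "abcadc", "adc"]
```

Like the single-character version, this only handles fixed delimiters; an arbitrary regex still needs the brute-force substring check.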
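Here's an approach similar to @ndn's and @Mark's that works with any string and regex. I've implemented it as a method of String because that's where I'd like to see it. Wouldn't it be a great complement to String#[] and String#scan?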
class String
  def all_matches(regex)
    return [] if empty?
    r = /\A#{regex}\z/ # \A and \z anchor to the whole substring, not just a line
    1.upto(size).with_object([]) { |i,a|
      a.concat(each_char.each_cons(i).map(&:join).select { |s| s =~ r }) }
  end
end
'abcadc'.all_matches /a.*c/
# => ["abc", "adc", "abcadc"]
'aaabaaa'.all_matches(/a.*a/)
#=> ["aa", "aa", "aa", "aa", "aaa", "aba", "aaa", "aaba", "abaa", "aaaba",
# "aabaa", "abaaa", "aaabaa", "aabaaa", "aaabaaa"]