Find all indices of a substring within a string

Question

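I would like to use Ruby to find the indices of all occurrences of a substring within a larger string. For example, all occurrences of "in" in "Einstein":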

str = "Einstein"
str.index("in") #returns only 1
str.scan("in")  #returns ["in","in"]
#desired output would be [1, 6]

Answer 1

tok*_*and 11

The standard hack is:

"Einstein".enum_for(:scan, /(?=in)/).map { Regexp.last_match.offset(0).first }
#=> [1, 6]

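Nice one, tok. Note that `"nnnn".enum_for(:scan, /nn/).map { Regexp.last_match.offset(0).first } #=> [0, 2]`. If `[0, 1, 2]` is the desired return value, change the regular expression (`/nn/`) to the lookahead `/(?=nn)/`, which matches at each position without consuming characters. (2 upvotes)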
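
The difference between the two patterns in the comment above can be sketched with a small wrapper. The method name `scan_indices` and its `overlapping:` keyword are made up for this illustration and are not part of the answer; the sketch also assumes the target is a literal substring, which is why it is passed through `Regexp.escape`.

```ruby
# Hypothetical helper (not from the answer): wraps the scan-based idiom and
# lets the caller choose between overlapping and non-overlapping matches.
def scan_indices(str, substring, overlapping: false)
  escaped = Regexp.escape(substring) # treat the substring literally
  # A lookahead matches at a position without consuming characters,
  # so scan advances one character at a time and finds overlaps.
  regex = overlapping ? /(?=#{escaped})/ : /#{escaped}/
  str.enum_for(:scan, regex).map { Regexp.last_match.begin(0) }
end

p scan_indices("Einstein", "in")                 # [1, 6]
p scan_indices("nnnn", "nn")                     # [0, 2]
p scan_indices("nnnn", "nn", overlapping: true)  # [0, 1, 2]
```

Like the one-liner it wraps, this still reads the match position from `Regexp.last_match`.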

Answer 2

Car*_*and 5

def indices_of_matches(str, target)
  sz = target.size
  (0..str.size-sz).select { |i| str[i,sz] == target }
end

indices_of_matches('Einstein', 'in')
  #=> [1, 6]
indices_of_matches('nnnn', 'nn')
  #=> [0, 1, 2]

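The second example reflects an assumption I made about how overlapping matches should be handled. If overlapping matches are not to be counted (i.e., `[0, 2]` is the desired return value in the second example), this answer is obviously not suitable.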
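
If both behaviors are needed, one possible sketch uses `String#index` with a start offset and makes the overlap handling explicit. The name `indices_of_substring` and the `overlapping:` keyword are assumptions introduced here for illustration, not part of the answer.

```ruby
# Hypothetical variant (not from the answer): walks the string with
# String#index(substring, offset) and chooses how far to advance
# after each hit.
def indices_of_substring(str, target, overlapping: true)
  return [] if target.empty? # avoid reporting every position for ""

  step = overlapping ? 1 : target.size
  result = []
  pos = 0
  while (i = str.index(target, pos))
    result << i
    pos = i + step # step 1 allows overlaps; target.size skips past the match
  end
  result
end

p indices_of_substring("Einstein", "in")                # [1, 6]
p indices_of_substring("nnnn", "nn")                    # [0, 1, 2]
p indices_of_substring("nnnn", "nn", overlapping: false) # [0, 2]
```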

Answer 3

Eri*_*nil 5

Here is a more verbose solution, which has the advantage of not relying on a global value:

def indices(string, regex)
  Enumerator.new do |yielder|
    # Reset on every enumeration so the Enumerator can be iterated more than once.
    position = 0
    while (match = regex.match(string, position))
      yielder << match.begin(0)
      position = match.end(0)
    end
  end
end

p indices("Einstein", /in/).to_a
# [1, 6]

It returns an Enumerator, so you can also use it lazily or take only the first n indices.
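
As a rough illustration of that point, the sketch below repeats the `indices` method so it runs on its own, then takes only the first index and chains a lazy transformation. The `+ 1` mapping is arbitrary and only there to show the chaining.

```ruby
# Same approach as the answer's method, repeated here so the example
# is self-contained.
def indices(string, regex)
  Enumerator.new do |yielder|
    position = 0
    while (match = regex.match(string, position))
      yielder << match.begin(0)
      position = match.end(0)
    end
  end
end

# Only the first match is computed; the loop stops after one yield.
p indices("Einstein", /in/).first(1)
# [1]

# Lazy chaining: indices are produced on demand as first(2) pulls them.
p indices("Einstein", /in/).lazy.map { |i| i + 1 }.first(2)
# [2, 7]
```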

Also, if you might need more information than just the indices, you can return an Enumerator of MatchData objects and extract the indices from them:

def matches(string, regex)
  Enumerator.new do |yielder|
    # Reset on every enumeration so the Enumerator can be iterated more than once.
    position = 0
    while (match = regex.match(string, position))
      yielder << match
      position = match.end(0)
    end
  end
end

p matches("Einstein", /in/).map{ |match| match.begin(0) }
# [1, 6]

To get the behavior @Cary describes (overlapping matches), replace the last line in the `while` block with `position = match.begin(0) + 1`.
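
A minimal sketch of that change is shown below. The method name `overlapping_indices` is made up for this example; the only change from the `indices` method in this answer is the line that advances `position`.

```ruby
# Hypothetical name; same structure as the answer's method, but the search
# restarts one character after the start of each match instead of at its end.
def overlapping_indices(string, regex)
  Enumerator.new do |yielder|
    position = 0
    while (match = regex.match(string, position))
      yielder << match.begin(0)
      position = match.begin(0) + 1 # allow the next match to overlap this one
    end
  end
end

p overlapping_indices("nnnn", /nn/).to_a
# [0, 1, 2]
p overlapping_indices("Einstein", /in/).to_a
# [1, 6]
```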
