Extract matched regex groups into an array in Scala

Wil*_*ill 5 regex arrays scala extract regex-group

I've run into this problem. I have a string:

val line:String = "PE018201804527901"

that matches this pattern:

regex : (.{2})(.{4})(.{9})(.{2})

I need to extract each group of the regex into an array.

The result should be:

Array("PE", "0182", "018045279", "01")

I tried this:

val regex =  """(.{2})(.{4})(.{9})(.{2})""".r
val x= regex.findAllIn(line).toArray

But it doesn't work! The array x holds only the whole matched string, not the individual groups.

she*_*nis 5

regex.findAllIn(line).subgroups.toArray
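
The one-liner above reads the groups from the iterator's current match. As a hedged alternative, assuming only the groups of the first match are wanted, here is a minimal sketch that makes the no-match case explicit with findFirstMatchIn (the names groups and x are just illustrative):

```scala
val line = "PE018201804527901"
val regex = """(.{2})(.{4})(.{9})(.{2})""".r

// findFirstMatchIn returns Option[Match]; None means the pattern was not found
val groups: Option[Array[String]] =
  regex.findFirstMatchIn(line).map(_.subgroups.toArray)

// Fall back to an empty array when there is no match
val x: Array[String] = groups.getOrElse(Array.empty[String])
// x: Array(PE, 0182, 018045279, 01)
```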


Wik*_*żew 5

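Note that findAllIn does not automatically anchor the regex pattern, so it will also find matches inside a longer string. If you want a match only when the entire 17-character string matches, you can use a match block like this: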

val line = "PE018201804527901"
val regex =  """(.{2})(.{4})(.{9})(.{2})""".r
val results = line match {
  case regex(g1, g2, g3, g4) => Array(g1, g2, g3, g4)
  case _ => Array[String]()
}
// Demo printing
results.foreach { m =>
  println(m)
} 
// PE
// 0182
// 018045279
// 01

See the Scala demo.

It also handles the no-match case gracefully by falling back to an empty string array.
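
To illustrate the anchoring difference, here is a small sketch, assuming Scala 2.12 or 2.13 syntax. The helper name groupsOfWholeMatch and the sample inputs are made up for this example. It binds all groups at once with a sequence pattern instead of naming each one:

```scala
import scala.util.matching.Regex

val regex: Regex = """(.{2})(.{4})(.{9})(.{2})""".r

// Hypothetical helper: all groups if the entire input matches, else an empty array
def groupsOfWholeMatch(input: String): Array[String] = input match {
  case regex(groups @ _*) => groups.toArray
  case _                  => Array.empty[String]
}

val exact  = groupsOfWholeMatch("PE018201804527901")   // 17 characters: matches
val longer = groupsOfWholeMatch("PE018201804527901XX") // 19 characters: no match
val found  = regex.findFirstIn("PE018201804527901XX")  // unanchored: still finds a match
```

Here exact holds the four groups, longer is empty because the extractor requires the whole input to match, and found is still defined because findFirstIn searches inside the string.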

If you need all matches together with all of their groups, collect the groups of each match into a list and append that list to a list buffer (scala.collection.mutable.ListBuffer):

import scala.collection.mutable.ListBuffer

val line = "PE018201804527901%E018201804527901"
val regex =  """(.{2})(.{4})(.{9})(.{2})""".r
val results = ListBuffer[List[String]]()

val mi = regex.findAllIn(line)
while (mi.hasNext) {
  mi.next() // advance to the next match so that group(n) refers to it
  results += List(mi.group(1), mi.group(2), mi.group(3), mi.group(4))
}
// Demo printing
results.foreach { m =>
  println("------")
  println(m)
  m.foreach { l => println(l) }
}

Result:

------
List(PE, 0182, 018045279, 01)
PE
0182
018045279
01
------
List(%E, 0182, 018045279, 01)
%E
0182
018045279
01

See this Scala demo.
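
The same collection can probably be written without a mutable buffer. A minimal sketch, assuming an immutable List[List[String]] is acceptable as the result type, using findAllMatchIn:

```scala
val line = "PE018201804527901%E018201804527901"
val regex = """(.{2})(.{4})(.{9})(.{2})""".r

// findAllMatchIn yields one Match per occurrence; subgroups lists its capture groups
val results: List[List[String]] =
  regex.findAllMatchIn(line).map(_.subgroups).toList

results.foreach(println)
// List(PE, 0182, 018045279, 01)
// List(%E, 0182, 018045279, 01)
```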


Wil*_*ill 5

Your solution, @sheunis, was very helpful. In the end I solved it with this method:

import scala.collection.mutable.ListBuffer
import scala.util.matching.Regex

def extractFromRegex(regex: Regex, line: String): Array[String] = {
  val list = ListBuffer[String]()
  // Walk every match, then every capture group within that match
  for (m <- regex.findAllIn(line).matchData;
       e <- m.subgroups)
    list += e
  list.toArray
}

because your solution, when used with this code:

val line:String = """PE0182"""
val regex ="""(.{2})(.{4})""".r  
val t = regex.findAllIn(line).subgroups.toArray

throws the following exception:

Exception in thread "main" java.lang.IllegalStateException: No match available
at java.util.regex.Matcher.start(Matcher.java:372)
at scala.util.matching.Regex$MatchIterator.start(Regex.scala:696)
at scala.util.matching.Regex$MatchData$class.group(Regex.scala:549)
at scala.util.matching.Regex$MatchIterator.group(Regex.scala:671)
at scala.util.matching.Regex$MatchData$$anonfun$subgroups$1.apply(Regex.scala:553)
at scala.util.matching.Regex$MatchData$$anonfun$subgroups$1.apply(Regex.scala:553)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at scala.util.matching.Regex$MatchData$class.subgroups(Regex.scala:553)
at scala.util.matching.Regex$MatchIterator.subgroups(Regex.scala:671)
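
The trace suggests that subgroups was read before the underlying java.util.regex.Matcher had attempted any match. Calling hasNext first should avoid this, since it runs the matcher. A minimal sketch under that assumption (firstMatchGroups is a hypothetical helper name):

```scala
import scala.util.matching.Regex

// Hypothetical helper: groups of the first match, or an empty array if there is none
def firstMatchGroups(regex: Regex, line: String): Array[String] = {
  val it = regex.findAllIn(line)
  // hasNext runs the underlying matcher, so a current match exists
  // before subgroups is read
  if (it.hasNext) it.subgroups.toArray else Array.empty[String]
}

val t = firstMatchGroups("""(.{2})(.{4})""".r, "PE0182")
// t: Array(PE, 0182)
```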