All possible sequences from given sets of characters

Ian*_*Ian 4 string r sequence tidyr

I have the following string:

'[ABC][abcd][XYZ]'

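I want to generate all possible strings where the first character is A, B, or C, the second character is a, b, c, or d, and the third character is X, Y, or Z.

Examples: AcX, BaZ, etc.

How can I do this, preferably in the tidyverse?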

jay*_*.sf 7

First strsplit the string appropriately to get a list of character sets, then use expand.grid and paste0 with do.call:

el(strsplit('[ABC][abcd][XYZ]', '[\\[|\\]]', perl=TRUE)) |>
  {\(x) x[x != '']}() |>
  sapply(strsplit, '') |>
  do.call(what=expand.grid) |>
  do.call(what=paste0)
# [1] "AaX" "BaX" "CaX" "AbX" "BbX" "CbX" "AcX" "BcX" "CcX" "AdX" "BdX" "CdX" "AaY" "BaY" "CaY" "AbY" "BbY" "CbY" "AcY" "BcY"
# [21] "CcY" "AdY" "BdY" "CdY" "AaZ" "BaZ" "CaZ" "AbZ" "BbZ" "CbZ" "AcZ" "BcZ" "CcZ" "AdZ" "BdZ" "CdZ"
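For readers who want to see the intermediate values, the pipeline above can be unrolled roughly as follows. This is only a sketch: the names x, parts, sets, grid, and res are illustrative and not part of the original answer, and [[1]] is used in place of el().

```r
x <- '[ABC][abcd][XYZ]'

# Split on the brackets; empty strings appear at the start and
# wherever a closing bracket is directly followed by an opening one
parts <- strsplit(x, '[\\[|\\]]', perl = TRUE)[[1]]
parts <- parts[parts != '']            # "ABC" "abcd" "XYZ"

# Break each set into single characters: a list of three character vectors
sets <- strsplit(parts, '')

# One row per combination (3 * 4 * 3 = 36), one column per position;
# expand.grid() varies the first column fastest
grid <- expand.grid(sets, stringsAsFactors = FALSE)

# Paste the columns together row by row
res <- do.call(paste0, grid)
head(res, 4)
# [1] "AaX" "BaX" "CaX" "AbX"
```

Passing stringsAsFactors = FALSE is optional here, since paste0 converts factors to character anyway; it just keeps the intermediate data frame as plain strings.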


Maë*_*aël 6

A stringr approach:

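library(stringr)
x <- '[ABC][abcd][XYZ]'
# Extract the text between each pair of brackets, split it into
# single characters, then build and paste all combinations
str_extract_all(x, "(?<=\\[).+?(?=\\])", simplify = TRUE) |>
  str_split("") |>
  expand.grid() |>
  do.call(what = paste0)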

# [1] "AaX" "BaX" "CaX" "AbX" "BbX" "CbX" "AcX" "BcX" "CcX" "AdX" "BdX" "CdX" "AaY" "BaY" "CaY" "AbY" "BbY" "CbY" "AcY" "BcY"
#[21] "CcY" "AdY" "BdY" "CdY" "AaZ" "BaZ" "CaZ" "AbZ" "BbZ" "CbZ" "AcZ" "BcZ" "CcZ" "AdZ" "BdZ" "CdZ"

This also works, using interaction:

library(stringr)
str_extract_all(x, "(?<=\\[).+?(?=\\])", simplify = TRUE) |>
  str_split("") |>
  interaction(sep = "") |>
  levels()

# [1] "AaX" "BaX" "CaX" "AbX" "BbX" "CbX" "AcX" "BcX" "CcX" "AdX" "BdX" "CdX" "AaY" "BaY" "CaY" "AbY" "BbY" "CbY" "AcY" "BcY"
#[21] "CcY" "AdY" "BdY" "CdY" "AaZ" "BaZ" "CaZ" "AbZ" "BbZ" "CbZ" "AcZ" "BcZ" "CcZ" "AdZ" "BdZ" "CdZ"
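Since the question asks for something closer to the tidyverse, here is a minimal sketch that swaps expand.grid for tidyr::expand_grid and paste0 for tidyr::unite. It is not from the answers above; the column names pos1, pos2, pos3 and the objects sets and combos are made up for illustration.

```r
library(stringr)
library(tidyr)

x <- '[ABC][abcd][XYZ]'

# Contents of each bracket pair, split into single characters
sets <- str_extract_all(x, "(?<=\\[).+?(?=\\])")[[1]] |>
  str_split("")

# expand_grid() needs named inputs, so give each position a
# (hypothetical) column name: pos1, pos2, pos3
names(sets) <- paste0("pos", seq_along(sets))

# Splice the list into expand_grid(), then glue the columns together
combos <- expand_grid(!!!sets) |>
  unite("string", everything(), sep = "")

head(combos$string, 4)
# [1] "AaX" "AaY" "AaZ" "AbX"
```

Note that the order differs from the expand.grid results: expand_grid varies the last column fastest, whereas expand.grid varies the first column fastest. The set of 36 strings is the same.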