Ian*_*Ian 4 string r sequence tidyr
I have the following string:
'[ABC][abcd][XYZ]'
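I want to generate every possible string in which the first character is A, B, or C, the second character is a, b, c, or d, and the third character is X, Y, or Z.
Examples: AcX, BaZ, and so on.
How can I do this, preferably in the Tidyverse?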
First split the string appropriately with strsplit to get a list, then use expand.grid and paste0 with do.call.
el(strsplit('[ABC][abcd][XYZ]', '[\\[|\\]]', perl=TRUE)) |>
{\(x) x[x != '']}() |>
sapply(strsplit, '') |>
do.call(what=expand.grid) |>
do.call(what=paste0)
# [1] "AaX" "BaX" "CaX" "AbX" "BbX" "CbX" "AcX" "BcX" "CcX" "AdX" "BdX" "CdX" "AaY" "BaY" "CaY" "AbY" "BbY" "CbY" "AcY" "BcY"
# [21] "CcY" "AdY" "BdY" "CdY" "AaZ" "BaZ" "CaZ" "AbZ" "BbZ" "CbZ" "AcZ" "BcZ" "CcZ" "AdZ" "BdZ" "CdZ"
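The same pipeline can be unrolled into named steps, which may make each stage easier to inspect. This is only a sketch in base R: the variable names (inner, groups, choices, grid, result) are mine, and it swaps the regex split for a fixed-string split on "][", which assumes the input is a run of bracketed groups with nothing between them.

```r
x <- '[ABC][abcd][XYZ]'

# 1. Drop the outer brackets, then split on the "][" between groups.
#    Assumes the input looks like [..][..][..] with nothing in between.
inner  <- gsub("^\\[|\\]$", "", x)
groups <- strsplit(inner, "][", fixed = TRUE)[[1]]

# 2. Split each group into single characters (one list element per position).
choices <- strsplit(groups, "")

# 3. Build every combination; expand.grid varies the first column fastest.
grid <- expand.grid(choices, stringsAsFactors = FALSE)

# 4. Paste the columns of each row together into one string.
result <- do.call(paste0, grid)

length(result)  # 3 * 4 * 3 combinations
```

The number of results is the product of the group sizes, so long groups or many positions grow quickly.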
A stringr approach:
library(stringr)
x <- '[ABC][abcd][XYZ]'
str_extract_all(x,"(?<=\\[).+?(?=\\])", simplify = TRUE) |>
str_split("") |>
expand.grid() |>
do.call(what = paste0)
# [1] "AaX" "BaX" "CaX" "AbX" "BbX" "CbX" "AcX" "BcX" "CcX" "AdX" "BdX" "CdX" "AaY" "BaY" "CaY" "AbY" "BbY" "CbY" "AcY" "BcY"
#[21] "CcY" "AdY" "BdY" "CdY" "AaZ" "BaZ" "CaZ" "AbZ" "BbZ" "CbZ" "AcZ" "BcZ" "CcZ" "AdZ" "BdZ" "CdZ"
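Since the question asks for the Tidyverse, tidyr::expand_grid could stand in for expand.grid. This is a rough sketch rather than a tested recipe: the column names pos1, pos2, ... are placeholders I made up, and note that expand_grid varies the first column slowest, so the results come out in a different order from the output above.

```r
library(stringr)
library(tidyr)

x <- '[ABC][abcd][XYZ]'

# Characters between each pair of brackets, one list element per position.
choices <- str_extract_all(x, "(?<=\\[).+?(?=\\])")[[1]] |>
  str_split("")

# Give the inputs names so they become column names (placeholder names).
names(choices) <- paste0("pos", seq_along(choices))

# Every combination as a tibble; the first column varies slowest.
combos <- expand_grid(!!!choices)

# Paste the columns of each row together into one string.
result <- do.call(paste0, combos)
```

If the order of the original output matters, sorting or sticking with expand.grid is probably simpler.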
This also works, using interaction:
library(stringr)
str_extract_all(x,"(?<=\\[).+?(?=\\])", simplify = TRUE) |>
str_split("") |>
interaction(sep = "") |> levels()
# [1] "AaX" "BaX" "CaX" "AbX" "BbX" "CbX" "AcX" "BcX" "CcX" "AdX" "BdX" "CdX" "AaY" "BaY" "CaY" "AbY" "BbY" "CbY" "AcY" "BcY"
#[21] "CcY" "AdY" "BdY" "CdY" "AaZ" "BaZ" "CaZ" "AbZ" "BbZ" "CbZ" "AcZ" "BcZ" "CcZ" "AdZ" "BdZ" "CdZ"