Repeating vector of letters

Joh*_*ler 11 r

Is there a function in R that creates a vector of repeating letters?

Something like

letters[1:30]
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z" NA  NA  NA  NA

But instead of NA, I would like the output to continue with aa, bb, cc, dd, ...

A5C*_*2T1 8

It is not difficult to put together a quick function that does this:

myLetters <- function(length.out) {
  a <- rep(letters, length.out = length.out)
  grp <- cumsum(a == "a")
  vapply(seq_along(a), 
         function(x) paste(rep(a[x], grp[x]), collapse = ""),
         character(1L))
}
myLetters(60)
#  [1] "a"   "b"   "c"   "d"   "e"   "f"   "g"   "h"   "i"   "j"   "k"   "l"  
# [13] "m"   "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"  
# [25] "y"   "z"   "aa"  "bb"  "cc"  "dd"  "ee"  "ff"  "gg"  "hh"  "ii"  "jj" 
# [37] "kk"  "ll"  "mm"  "nn"  "oo"  "pp"  "qq"  "rr"  "ss"  "tt"  "uu"  "vv" 
# [49] "ww"  "xx"  "yy"  "zz"  "aaa" "bbb" "ccc" "ddd" "eee" "fff" "ggg" "hhh"
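
The same output can probably be produced without the per-element anonymous function. The following is only a sketch, not part of the original answer: the name my_letters_vec is hypothetical, and it assumes R 3.3.0 or later, where base R provides strrep(). The position within the alphabet comes from the remainder of the zero-based index, and the repeat count from its integer quotient.

```r
# Hypothetical vectorized variant of myLetters(); assumes base R >= 3.3.0 for strrep().
my_letters_vec <- function(length.out) {
  idx <- seq_len(length.out) - 1L        # zero-based positions 0, 1, 2, ...
  letter <- letters[idx %% 26L + 1L]     # which letter: cycles a..z
  times <- idx %/% 26L + 1L              # how many copies: 1 for the first 26, then 2, ...
  strrep(letter, times)
}

my_letters_vec(60)[c(1, 26, 27, 52, 53, 60)]
# [1] "a"   "z"   "aa"  "zz"  "aaa" "hhh"
```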


Mat*_*rde 8

If you only need unique names, you could use

make.unique(rep(letters, length.out = 30), sep='')
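
Note that make.unique() does not double the letters; it appends a sequence number to each duplicate. A small check of what the call above returns (the variable name u is only for illustration):

```r
# make.unique() keeps the first occurrence as-is and appends sep plus a counter to repeats.
u <- make.unique(rep(letters, length.out = 30), sep = "")

tail(u, 4)
# [1] "a1" "b1" "c1" "d1"

anyDuplicated(u)  # 0 means every name is unique
```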

Edit:

Here is another way to get the repeated letters, using Reduce.

myletters <- function(n) 
unlist(Reduce(paste0, 
       replicate(n %/% length(letters), letters, simplify=FALSE),
       init=letters,
       accumulate=TRUE))[1:n]

myletters(60)
#  [1] "a"   "b"   "c"   "d"   "e"   "f"   "g"   "h"   "i"   "j"   "k"   "l"  
# [13] "m"   "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"  
# [25] "y"   "z"   "aa"  "bb"  "cc"  "dd"  "ee"  "ff"  "gg"  "hh"  "ii"  "jj" 
# [37] "kk"  "ll"  "mm"  "nn"  "oo"  "pp"  "qq"  "rr"  "ss"  "tt"  "uu"  "vv" 
# [49] "ww"  "xx"  "yy"  "zz"  "aaa" "bbb" "ccc" "ddd" "eee" "fff" "ggg" "hhh"
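
To see why this works, it may help to run the same Reduce call on a tiny alphabet. This is an illustrative sketch only; the two-letter vector abc and the variable steps are made up for the example. With accumulate = TRUE, Reduce keeps every intermediate result, and each step pastes one more copy of the alphabet onto the previous one, element by element.

```r
# Toy alphabet to make the intermediate results easy to read.
abc <- c("x", "y")

# Each step computes paste0(previous, abc); accumulate = TRUE keeps all steps.
steps <- Reduce(paste0,
                replicate(2, abc, simplify = FALSE),
                init = abc,
                accumulate = TRUE)

unlist(steps)
# [1] "x"   "y"   "xx"  "yy"  "xxx" "yyy"
```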


Gre*_*gor 6

Working solution

A function for generating Excel-style column names, i.e.

# A, B, ..., Z, AA, AB, ..., AZ, BA, BB, ..., ..., ZZ, AAA, ...

letterwrap <- function(n, depth = 1) {
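    # One copy of LETTERS per position in the name
    args <- lapply(seq_len(depth), FUN = function(x) LETTERS)
    x <- do.call(expand.grid, args = list(args, stringsAsFactors = FALSE))
    # Reverse the columns so the last letter varies fastest (AA, AB, AC, ...)
    x <- x[, rev(names(x)), drop = FALSE]
    x <- do.call(paste0, x)
    if (n <= length(x)) return(x[seq_len(n)])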
    return(c(x, letterwrap(n - length(x), depth = depth + 1)))
}

letterwrap(26^2 + 52) # through AAZ

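Failed attempt

Initially I thought this would best be done cleverly by converting to base 26, but that doesn't work. The problem is that Excel column names are not base 26, which took me a long time to realize. The catch is 0: if you try to map a letter (such as A) to 0, you run into trouble when you want to distinguish between A, AA, AAA, and so on, because leading zeros do not change a number's value.

Another way to illustrate the problem is in terms of "digits". In base 10 there are 10 single-digit numbers (0-9), then 90 two-digit numbers (10:99), 900 three-digit numbers, ... which generalizes to 10^d - 10^(d - 1) numbers with d digits for d > 1. In Excel column names, however, there are 26 single-letter names, 26^2 two-letter names, 26^3 three-letter names, with no subtraction.

I'll leave this code here as a warning to others: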
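
The counting argument can be checked numerically. The snippet below is just an illustration (all variable names are made up): it tabulates how many names exist at each length and where each length ends in the sequence, next to the base-10 counts.

```r
d <- 1:3

# Excel-style names: 26^d names of length d, nothing subtracted
names_per_length <- 26^d
last_index <- cumsum(names_per_length)   # index of Z, ZZ, ZZZ

# Base 10: 10 one-digit numbers, then 10^d - 10^(d - 1) for d > 1
base10_per_length <- c(10, 10^(2:3) - 10^(1:2))

names_per_length   # 26 676 17576
last_index         # 26 702 18278
base10_per_length  # 10 90 900
```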

## Converts a number to base 26, returns a vector for each "digit"
b26 <- function(n) {
    stopifnot(n >= 0)
    if (n <= 1) return(n)
    n26 <- rep(NA, ceiling(log(n, base = 26)))
    for (i in seq_along(n26)) {
        n26[i] <- (n %% 26)
        n <- n %/% 26
    }
    return(rev(n26))
}

## Returns the name of nth value in the sequence
## A, B, C, ..., Z, AA, AB, AC, ..., AZ, BA, ...
letterwrap1 <- function(n, lower = FALSE) {
    let <- if (lower) letters else LETTERS
    base26 <- b26(n)
    base26[base26 == 0] <- 26
    paste(let[base26], collapse = "")
}

## Vectorized version of letterwrap1
letter_col_names <- Vectorize(letterwrap1, vectorize.args="n")

> letter_col_names(1:4)
[1] "A" "B" "C" "D"

> letter_col_names(25:30)
[1] "Y"  "Z"  "AA" "AB" "AC" "AD"

# Looks pretty good
# Until we get here:
> letter_col_names(50:54)
[1] "AX" "AY" "BZ" "BA" "BB"
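
For comparison, the arithmetic approach can likely be repaired by treating the names as bijective base 26, that is, with digits 1 to 26 and no zero. The sketch below is not part of the original answer and the function name excel_col_name is hypothetical. The only change from an ordinary base conversion is subtracting 1 before taking the remainder and the quotient, so that multiples of 26 map to Z without borrowing from the next position.

```r
# Hypothetical helper: n-th name in A, B, ..., Z, AA, AB, ... (bijective base 26).
excel_col_name <- function(n) {
  stopifnot(length(n) == 1L, n >= 1)
  digits <- integer(0)
  while (n > 0) {
    r <- (n - 1) %% 26          # 0..25 for this position
    digits <- c(r + 1, digits)  # store as 1..26 so it indexes LETTERS directly
    n <- (n - 1) %/% 26         # shift by one so 26 maps to "Z", not to "A0"
  }
  paste(LETTERS[digits], collapse = "")
}

vapply(50:54, excel_col_name, character(1))
# [1] "AX" "AY" "AZ" "BA" "BB"

excel_col_name(26^2 + 52)
# [1] "AAZ"
```

Unlike the recursive expand.grid approach, this computes a single name directly from its index without building all the names before it.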