Uniform use of one- and two-digit numbers in strings

Luk*_*uks 3 string r data-management data.table

I have a very large data.table in which a (large) number of items are identified by strings that combine text and numbers.

library(data.table)
dd <- data.table(x = c("A4", "A4", "A4", "A14", "A14", "A14", "B4", "B4", "B4"),
                 y = c("A4", "A14", "B4", "A4", "A14", "B4", "A4", "A14", "B4"),
                 z = c(1, 2, 3, 4, 5, 6, 7, 8, 9))

x   y   z
A4  A4  1
A4  A14 2
A4  B4  3
A14 A4  4
A14 A14 5
A14 B4  6
B4  A4  7
B4  A14 8
B4  B4  9

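The numbers can have one or two digits, so R always sorts the strings by the first digit of the number (A14 before A4). mixedsort can handle this. However, when I reshape the long data to wide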

wide <- dcast(dd, x ~ y, value.var = "z")

R sorts again according to its default ordering rules.

x    A14  A4  B4
A14  5    4   6
A4   2    1   3
B4   8    7   9

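However, I need the original order for the matrix calculations that follow. Is there an efficient way to rename string + single digit to string + double digit (A4 -> A04), or another approach that I have missed?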

Jaa*_*aap 5

Another option, and probably the simplest, is to use mixedorder from the gtools package:

wide <- dcast(dd, x ~ y, value.var = "z")[gtools::mixedorder(x)]

This gives:

> wide
     x A14 A4 B4
1:  A4   2  1  3
2: A14   5  4  6
3:  B4   8  7  9

If you also want to put the columns in the same order, you can additionally use setcolorder:

setcolorder(wide, c(1, gtools::mixedorder(names(wide)[-1]) + 1))

which then gives:

> wide
     x A4 A14 B4
1:  A4  1   2  3
2: A14  4   5  6
3:  B4  7   8  9
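The renaming that the question asks about (A4 -> A04) can also be sketched in base R. This is only a minimal sketch: pad_id is a hypothetical helper name (it is not part of data.table or gtools), and it assumes every ID is a non-digit prefix followed by a non-negative integer.

```r
# Hypothetical helper: zero-pad the numeric suffix of IDs, e.g. "A4" -> "A04".
# Assumption: each ID is a non-digit prefix followed by a non-negative integer.
pad_id <- function(id, width = 2) {
  prefix <- sub("[0-9]+$", "", id)               # "A" from "A14"
  number <- as.integer(sub("^[^0-9]*", "", id))  # 14 from "A14"
  # formatC left-pads the number with zeros up to `width` characters
  paste0(prefix, formatC(number, width = width, flag = "0"))
}

ids <- c("A4", "A14", "B4")
padded <- pad_id(ids)
print(padded)  # "A04" "A14" "B04"

# After padding, plain character sorting agrees with the numeric order
sorted <- sort(pad_id(c("B4", "A14", "A4")))
print(sorted)  # "A04" "A14" "B04"
```

With the table from the question, the helper could be applied to both ID columns before reshaping, for example with dd[, c("x", "y") := lapply(.SD, pad_id), .SDcols = c("x", "y")]. Note that this changes the labels themselves, whereas the mixedorder approach above keeps the original names and only reorders rows and columns.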