Luk*_*uks 3 string r data-management data.table
I have a very large data.table in which a (large) number of items are identified by strings made up of text and numbers.
library(data.table)
dd <- data.table(x = c("A4","A4","A4","A14","A14","A14","B4","B4","B4"),y = c("A4","A14","B4","A4","A14","B4","A4","A14","B4"), z = c(1,2,3,4,5,6,7,8,9))
x y z
A4 A4 1
A4 A14 2
A4 B4 3
A14 A4 4
A14 A14 5
A14 B4 6
B4 A4 7
B4 A14 8
B4 B4 9
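The numbers can have one or two digits, so R always sorts the strings by the first digit of the number (A14 before A4). mixedsort can handle this. However, when I reshape the long data to wide format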
wide <- dcast(dd, x ~ y, value.var = "z")
R applies the sort again according to its basic collation rules.
x A14 A4 B4
A14 5 4 6
A4 2 1 3
B4 8 7 9
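However, I need the original order for the matrix calculations that follow. Is there an efficient way to rename string + single digit to string + double digit (A4 -> A04), or is there another approach that I have missed?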
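The renaming mentioned here (A4 -> A04) can be sketched as follows. This is only an illustration: `pad_id` is a hypothetical helper, not part of data.table or gtools, and it assumes every ID is a run of letters followed by a run of digits. Once the numbers are zero-padded, the default sort that dcast applies matches the natural order.

```r
library(data.table)

dd <- data.table(
  x = c("A4", "A4", "A4", "A14", "A14", "A14", "B4", "B4", "B4"),
  y = c("A4", "A14", "B4", "A4", "A14", "B4", "A4", "A14", "B4"),
  z = c(1, 2, 3, 4, 5, 6, 7, 8, 9)
)

# Hypothetical helper: zero-pad the numeric suffix of an ID ("A4" -> "A04").
# Assumes each ID is letters followed by digits.
pad_id <- function(id, width = 2) {
  prefix <- sub("[0-9]+$", "", id)               # letter part, e.g. "A"
  number <- as.integer(sub("^[^0-9]+", "", id))  # numeric part, e.g. 4
  paste0(prefix, formatC(number, width = width, flag = "0"))
}

# Work on a copy so the original IDs stay available
dd_padded <- copy(dd)
dd_padded[, c("x", "y") := lapply(.SD, pad_id), .SDcols = c("x", "y")]

wide_padded <- dcast(dd_padded, x ~ y, value.var = "z")
names(wide_padded)  # "x" "A04" "A14" "B04"
```

The trade-off is that the IDs themselves change, so any later lookup has to use the padded form. If some IDs have three or more digits, `width` needs to be raised accordingly.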
Another, and probably the simplest, option is to use mixedorder from the gtools package:
wide <- dcast(dd, x ~ y, value.var = "z")[gtools::mixedorder(x)]
This gives:
> wide
     x A14 A4 B4
1:  A4   2  1  3
2: A14   5  4  6
3:  B4   8  7  9
If you also want to put the columns in the same order, you can additionally use setcolorder:
setcolorder(wide, c(1, gtools::mixedorder(names(wide)[-1]) + 1))
Which then gives:
> wide
     x A4 A14 B4
1:  A4  1   2  3
2: A14  4   5  6
3:  B4  7   8  9
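If the order in which the IDs first appear in the long data is the order you need, a further possibility is to fix that order through factor levels before reshaping, so no reordering is needed afterwards. A minimal sketch, assuming that dcast orders both the rows and the new columns by factor level (worth checking on your data.table version); the names `lev`, `dd_f` and `wide_f` are only illustrative:

```r
library(data.table)

dd <- data.table(
  x = c("A4", "A4", "A4", "A14", "A14", "A14", "B4", "B4", "B4"),
  y = c("A4", "A14", "B4", "A4", "A14", "B4", "A4", "A14", "B4"),
  z = c(1, 2, 3, 4, 5, 6, 7, 8, 9)
)

# Assumption: order of first appearance is the desired order.
# gtools::mixedsort(unique(c(dd$x, dd$y))) could be used here instead.
lev <- unique(c(dd$x, dd$y))

# Convert both ID columns to factors that share the same level order
dd_f <- copy(dd)
dd_f[, `:=`(x = factor(x, levels = lev), y = factor(y, levels = lev))]

wide_f <- dcast(dd_f, x ~ y, value.var = "z")
```

Note that `x` in `wide_f` is then a factor; use `as.character(wide_f$x)` if plain strings are needed, for example as row names of the matrix.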