Convert a non-numeric factor to a numeric column using a mapping in R

jus*_*nvf 7 r

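I have a data frame of factors with levels like hot, warm, tepid, cold, very cold, freezing. I want to map them to an integer column with values in the range [-2, 2] for regression, with some levels mapping to the same value. I would like to be able to specify an explicit mapping, so that very hot maps to 2, very cold maps to -2, and so on. How can I do this cleanly? Ideally I want a function that I just pass a named list to, or something similar.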

Leo*_*Leo 14

Suppose the factor vector x holds the categories.

temperatures <- c("hot", "warm", "tepid", "cold", "very cold", "freezing")
set.seed(1)
x <- as.factor(sample(temperatures, 10, replace=TRUE))
x
[1] warm     tepid    cold     freezing warm     freezing freezing cold    
[9] cold     hot     
Levels: cold freezing hot tepid warm

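Create a named numeric vector temp.map that holds the mapping, then index it by the character form of the factor. Note that "hot" and "warm" map to the same value below.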

temp.map <- c("hot"=2, "warm"=2, "tepid"=1, "cold"=0, "very cold"=-1, "freezing"=-1)    
y <- temp.map[as.character(x)]
y
warm    tepid     cold freezing     warm freezing freezing     cold 
   2        1        0       -1        2       -1       -1        0 
cold      hot 
   0        2 
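Since the question asks for a function that takes a named mapping, the lookup above could be wrapped roughly as follows. This is only a sketch: the helper name map_levels, the data frame df, and its column temp are made up for illustration, and stopping on unmapped levels is one possible design choice.

```r
# Hypothetical helper (not part of the original answer): look up each factor
# level in a named numeric vector and return plain integers.
map_levels <- function(f, mapping) {
  keys <- as.character(f)
  # Fail loudly if some level has no entry in the mapping.
  unknown <- setdiff(unique(keys[!is.na(keys)]), names(mapping))
  if (length(unknown) > 0) {
    stop("No mapping for level(s): ", paste(unknown, collapse = ", "))
  }
  as.integer(unname(mapping[keys]))
}

temp.map <- c("hot" = 2, "warm" = 2, "tepid" = 1,
              "cold" = 0, "very cold" = -1, "freezing" = -1)

# Example data frame; the column name `temp` is an assumption.
df <- data.frame(temp = factor(c("hot", "cold", "freezing", "warm")))
df$temp_num <- map_levels(df$temp, temp.map)
df$temp_num  # values: 2, 0, -1, 2
```

Indexing by as.character(f) rather than by the factor itself matters: indexing with a factor would use its integer codes, not its labels.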


nic*_*ico 7

A factor can easily be converted to integers with as.integer.

For example:

> temperatures <- c("Hot", "Warm", "Tepid", "Cold", "Very cold", "Freezing")
> set.seed(12345)
> a <- sample(temperatures, 10, replace = TRUE)
> a <- factor(a, levels = temperatures)
> a
 [1] Very cold Freezing  Very cold Freezing  Tepid     Hot       Warm     
 [8] Cold      Very cold Freezing 
Levels: Hot Warm Tepid Cold Very cold Freezing
> as.integer(a)
 [1] 5 6 5 6 3 1 2 4 5 6

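If you need the values centered near the [-2, 2] range, you can subtract 3. Note that with six levels the result actually spans -2 to 3, as the output below shows: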

> as.integer(a)-3
  [1]  2  3  2  3  0 -2 -1  1  2  3
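To stay strictly within [-2, 2] and let several levels share a value, the integer codes could instead be used as indices into a vector of target values. A minimal sketch, assuming the level order shown (with the spelling Tepid); the name level.values and the specific numbers are illustrative choices, not from the answer:

```r
# Assumed level order, warmest first.
temperatures <- c("Hot", "Warm", "Tepid", "Cold", "Very cold", "Freezing")
# One target value per level, in the same order as `temperatures`.
# These particular values are an illustrative choice.
level.values <- c(2L, 1L, 0L, -1L, -2L, -2L)

a <- factor(c("Very cold", "Hot", "Tepid", "Freezing"), levels = temperatures)
# as.integer(a) gives the level codes 5 1 3 6, which index into level.values.
b <- level.values[as.integer(a)]
b  # values: -2, 2, 0, -2
```

This relies on the factor being created with an explicit levels argument; with the default alphabetical level order the codes would not line up with level.values.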