I have a dataset containing the following data:
value
[1,] 41601325
[2,] 54917632
[3,] 64616616
[4,] 90791277
[5,] 35335221
[6,] .
. .
. .
I have to scale it down to the range [0,1] using
apply(data1, MARGIN = 2, FUN = function(X) (X - min(X))/diff(range(X)))
because I need to pass the data to GP_fit() from the GPfit package. The scaled values become:
value
[1,] .4535
[2,] .56355
[3,] .64616
[4,] .70791
[5,] .35563
[6,] .
. .
. .
After applying GP_fit() to the scaled data and calling predict(), I get new values as output, which again lie in the range [0,1], like this:
value
[1,] .0135
[2,] .234355
[3,] .6716
[4,] .325079
[5,] .95563
[6,] .
. .
. .
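But I want to bring these values back to the original range. How can I do that?
Basically, I want to show the output of predict() on the original scale.
Note: the original range is not fixed and can vary, but the maximum possible value is usually around 20 million.
Update: I tried to implement the code written by @JustinFletcher. My data is: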
value
[1,] 54.2
[2,] 53.8
[3,] 53.9
[4,] 53.8
[5,] 54.9
[6,] 55.0
[7,] 38.5
[8,] 38.0
[9,] 38.1
[10,] 38.0
[11,] 38.8
[12,] 38.9
[13,] 24.3
[14,] 24.1
[15,] 24.3
[16,] 24.1
[17,] 24.4
[18,] 24.4
[19,] 57.3
[20,] 57.2
[21,] 57.6
[22,] 57.7
[23,] 58.1
[24,] 57.9
I wrote this to rescale it to the range [0,1]:
data_new <- apply(data_test, MARGIN = 2, FUN = function(X) (X - min(X))/diff(range(X)))
and I got
value
[1,] 0.885294118
[2,] 0.873529412
[3,] 0.876470588
[4,] 0.873529412
[5,] 0.905882353
[6,] 0.908823529
[7,] 0.423529412
[8,] 0.408823529
[9,] 0.411764706
[10,] 0.408823529
[11,] 0.432352941
[12,] 0.435294118
[13,] 0.005882353
[14,] 0.000000000
[15,] 0.005882353
[16,] 0.000000000
[17,] 0.008823529
[18,] 0.008823529
[19,] 0.976470588
[20,] 0.973529412
[21,] 0.985294118
[22,] 0.988235294
[23,] 1.000000000
[24,] 0.994117647
Then, to revert it to the original scale, I wrote this:
data_revert <- apply(data_new, MARGIN = 2, FUN = function(X, Y) (X + min(Y))*diff(range(Y)), Y=data_test)
and I got
value
[1,] 849.5
[2,] 849.1
[3,] 849.2
[4,] 849.1
[5,] 850.2
[6,] 850.3
[7,] 833.8
[8,] 833.3
[9,] 833.4
[10,] 833.3
[11,] 834.1
[12,] 834.2
[13,] 819.6
[14,] 819.4
[15,] 819.6
[16,] 819.4
[17,] 819.7
[18,] 819.7
[19,] 852.6
[20,] 852.5
[21,] 852.9
[22,] 853.0
[23,] 853.4
[24,] 853.2
This output is not correct.
This is simple algebra. To scale the data, you compute
n = (e - e_min)/(e_max - e_min)
Now you need to recover e, given arbitrary e_min and e_max. It is trivial to show that
n(e_max - e_min) + e_min = e
Example:
e <- 1:10
n <- (e - min(e))/(max(e) - min(e))
new.e <- n*(max(e) - min(e)) + min(e)
all(e == new.e)
[1] TRUE
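Applied to the data in the question's update, the attempted revert computes (X + min(Y)) * diff(range(Y)), which adds the minimum before multiplying. The formula above multiplies by the range first and adds the minimum afterwards. Below is a minimal sketch of that, assuming a one-column matrix as in the question. The names col_min, col_span, pred_scaled and pred_orig are mine, and pred_scaled is a made-up stand-in for whatever predict() returns, not real GPfit output.

```r
# Values copied from the question's update (one column named "value").
data_test <- matrix(
  c(54.2, 53.8, 53.9, 53.8, 54.9, 55.0,
    38.5, 38.0, 38.1, 38.0, 38.8, 38.9,
    24.3, 24.1, 24.3, 24.1, 24.4, 24.4,
    57.3, 57.2, 57.6, 57.7, 58.1, 57.9),
  ncol = 1, dimnames = list(NULL, "value")
)

# Store each column's minimum and span BEFORE scaling; they are needed later.
col_min  <- apply(data_test, 2, min)
col_span <- apply(data_test, 2, function(x) diff(range(x)))

# Forward: n = (e - e_min) / (e_max - e_min)
data_new <- sweep(sweep(data_test, 2, col_min, "-"), 2, col_span, "/")

# Inverse: e = n * (e_max - e_min) + e_min  (multiply first, then add)
data_revert <- sweep(sweep(data_new, 2, col_span, "*"), 2, col_min, "+")

# Compare with a tolerance instead of ==, since these are floating-point values.
isTRUE(all.equal(data_revert, data_test))

# Hypothetical scaled predictions (placeholder for the output of predict()).
pred_scaled <- c(0, 0.5, 1)

# Map them back using the ORIGINAL data's minimum and span,
# not the minimum and span of the predictions themselves.
pred_orig <- pred_scaled * col_span[["value"]] + col_min[["value"]]
pred_orig
```

The point to keep in mind is that the minimum and span must come from the data that was scaled before fitting. Since the original range can vary between datasets, it is probably easiest to save those two numbers alongside each fitted model.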