Backtransform `scale()` for plotting

smi*_*lig 41 r

I have an explanatory variable that is centered using scale() and is used to predict a response variable:

d <- data.frame(
  x=runif(100),
  y=rnorm(100)
)

d <- within(d, s.x <- scale(x))

m1 <- lm(y~s.x, data=d)

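I would like to plot the predicted values, but using the original scale of x rather than the centered scale. Is there a way to backtransform or reverse-scale s.x?

Thanks!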

Jus*_*tin 55

Take a look at:

attributes(d$s.x)

You can use the attributes to unscale:

d$s.x * attr(d$s.x, 'scaled:scale') + attr(d$s.x, 'scaled:center')

For example:

> x <- 1:10
> s.x <- scale(x)

> s.x
            [,1]
 [1,] -1.4863011
 [2,] -1.1560120
 [3,] -0.8257228
 [4,] -0.4954337
 [5,] -0.1651446
 [6,]  0.1651446
 [7,]  0.4954337
 [8,]  0.8257228
 [9,]  1.1560120
[10,]  1.4863011
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765

> s.x * attr(s.x, 'scaled:scale') + attr(s.x, 'scaled:center')
      [,1]
 [1,]    1
 [2,]    2
 [3,]    3
 [4,]    4
 [5,]    5
 [6,]    6
 [7,]    7
 [8,]    8
 [9,]    9
[10,]   10
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765
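
To tie this back to the plotting question, here is a minimal sketch (not part of the original answer) that fits the model on the scaled predictor and then draws the fitted line against x on its original scale. The seed and the names x_orig and ord are assumptions added for illustration.

```r
# Sketch: fit on the scaled predictor, plot against the original scale.
set.seed(42)  # assumption: seed added only so the example is reproducible

d <- data.frame(
  x = runif(100),
  y = rnorm(100)
)
d <- within(d, s.x <- scale(x))

m1 <- lm(y ~ s.x, data = d)

# Recover x on its original scale from the attributes stored by scale()
x_orig <- as.numeric(d$s.x) * attr(d$s.x, "scaled:scale") +
  attr(d$s.x, "scaled:center")

# Plot the data and the fitted line, ordered by the unscaled x values
ord <- order(x_orig)
plot(x_orig, d$y, xlab = "x (original scale)", ylab = "y")
lines(x_orig[ord], fitted(m1)[ord])
```

Because the back-transformation is linear, the fitted values themselves do not change; only the axis they are plotted against does.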

  • Nice response, +1. Should `attr(s.x, 'scaled:center')` be `attr(d$s.x, 'scaled:center')`? (2 upvotes)

Fer*_*ndo 12

For a data frame or matrix:

set.seed(1)
x = matrix(sample(1:12), ncol= 3)
xs = scale(x, center = TRUE, scale = TRUE)

x.orig = t(apply(xs, 1, function(r)r*attr(xs,'scaled:scale') + attr(xs, 'scaled:center')))

print(x)
     [,1] [,2] [,3]
[1,]    4    2    3
[2,]    5    7    1
[3,]    6   10   11
[4,]    9   12    8

print(x.orig)
     [,1] [,2] [,3]
[1,]    4    2    3
[2,]    5    7    1
[3,]    6   10   11
[4,]    9   12    8

Be careful when using functions like identical():

print(x - x.orig)
     [,1] [,2]         [,3]
[1,]    0    0 0.000000e+00
[2,]    0    0 8.881784e-16
[3,]    0    0 0.000000e+00
[4,]    0    0 0.000000e+00

identical(x, x.orig)
# FALSE
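
A tolerance-based comparison is usually what you want here. The sketch below (an illustration, not part of the original answer) hard-codes the matrix printed above as integers and compares the round-tripped values with all.equal(), which allows for small floating-point differences. Note that identical() also compares storage type, so an integer matrix never matches its double-valued reconstruction.

```r
# Sketch: compare a round-tripped matrix with a numeric tolerance.
# The values are the matrix printed above, entered by hand as integers.
x <- matrix(c(4L, 5L, 6L, 9L, 2L, 7L, 10L, 12L, 3L, 1L, 11L, 8L), ncol = 3)
xs <- scale(x, center = TRUE, scale = TRUE)

# Same row-wise back-transformation as in the answer
x.orig <- t(apply(xs, 1, function(r) {
  r * attr(xs, "scaled:scale") + attr(xs, "scaled:center")
}))

# identical() is strict: storage type, attributes, and exact values must all match
print(identical(x, x.orig))

# all.equal() compares numerically, within a small tolerance
print(isTRUE(all.equal(x, x.orig, check.attributes = FALSE)))
```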

  • Thanks! This helped me back-calculate the cluster *centers* after kMeans clustering with a scaled matrix: `centers <- t(apply(clustering$centers, 1, function(r) r * attr(scaled_mat, 'scaled:scale') + attr(scaled_mat, 'scaled:center')))`. The accepted answer did not. (2 upvotes)

Nea*_*ltz 7

I felt like this should be a proper function, so here is my attempt at one:

#' Reverse a scale
#'
#' Computes x = sz+c, which is the inverse of z = (x - c)/s 
#' provided by the \code{scale} function.
#' 
#' @param z a numeric matrix(like) object
#' @param center either NULL or a numeric vector of length equal to the number of columns of z  
#' @param scale  either NULL or a numeric vector of length equal to the number of columns of z
#'
#' @seealso \code{\link{scale}}
#' @examples
#'  mtcs <- scale(mtcars)
#'  
#'  all.equal(
#'    unscale(mtcs), 
#'    as.matrix(mtcars), 
#'    check.attributes=FALSE
#'  )
#'  
#' @export
unscale <- function(z, center = attr(z, "scaled:center"), scale = attr(z, "scaled:scale")) {
  if(!is.null(scale))  z <- sweep(z, 2, scale, `*`)
  if(!is.null(center)) z <- sweep(z, 2, center, `+`)
  structure(z,
    "scaled:center"   = NULL,
    "scaled:scale"    = NULL,
    "unscaled:center" = center,
    "unscaled:scale"  = scale
  )
}
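
Passing the center and scale explicitly matters when the object to back-transform no longer carries the attributes, for example the cluster centers returned by kmeans() on a scaled matrix. The following is a self-contained sketch of that case using the same two sweep() steps as the function body; the iris data, the seed, and the variable names are assumptions chosen for illustration.

```r
# Sketch: back-transform k-means centers that were computed on scaled data.
set.seed(1)  # assumption: seed added so the clustering is reproducible

m  <- as.matrix(iris[, 1:4])   # illustrative data set shipped with R
ms <- scale(m)                 # stores scaled:center and scaled:scale
km <- kmeans(ms, centers = 3)  # km$centers has no scale() attributes

# Multiply each column by its scale, then add its center back
centers_orig <- sweep(km$centers, 2, attr(ms, "scaled:scale"), `*`)
centers_orig <- sweep(centers_orig, 2, attr(ms, "scaled:center"), `+`)

# Cross-check: per-cluster means of the original (unscaled) columns
check <- apply(m, 2, function(col) tapply(col, km$cluster, mean))
print(all.equal(centers_orig, check, check.attributes = FALSE))
```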


the*_*ist 6

tl;dr:

unscaled_vals <- xs * attr(xs, 'scaled:scale') + attr(xs, 'scaled:center')
  • where xs is a scaled object created by scale(x)

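For those trying to make a bit of sense of this:

How R scales

The scale function performs both scaling and centering by default.

  • Of the two, the function performs centering first.

By default, centering is achieved by subtracting the mean of all !is.na input values from each value:

data - mean(data, na.rm = TRUE)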

Scaling is then achieved by dividing each value by:

sqrt(sum(x^2) / (n - 1))

where x is the set of all !is.na values to be scaled and n = length(x).

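  • Importantly, though, when center = TRUE in scale, x is not the original data set but the already centered data.

    So if center = TRUE (the default), the scale function is really computing:

     sqrt(sum((data - mean(data, na.rm = TRUE))^2) / (n - 1))

    • Note: [when center = TRUE] this is the same as taking the standard deviation: sd(data, na.rm = TRUE)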
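
These claims are easy to check numerically. The sketch below uses a small made-up vector (an assumption for illustration) and compares the attributes stored by scale() with the formulas given here, both with and without centering.

```r
# Sketch: check what scale() stores against the formulas above.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)  # made-up example data

# Default: center = TRUE, scale = TRUE
xs <- scale(x)
print(all.equal(attr(xs, "scaled:center"), mean(x)))  # center is the mean
print(all.equal(attr(xs, "scaled:scale"), sd(x)))     # scale is the standard deviation

# Without centering, the scale is the root mean square with an n - 1 denominator
xs_nc <- scale(x, center = FALSE)
rms <- sqrt(sum(x^2) / (length(x) - 1))
print(all.equal(attr(xs_nc, "scaled:scale"), rms))
```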

How to unscale

Explanation

  1. First multiply the scaled values (xs) by the scaling factor computed from the original data (x):

    y = xs * sqrt(sum((x - mean(x, na.rm = TRUE))^2, na.rm = TRUE) / (sum(!is.na(x)) - 1))

  2. Then add the mean back:

    y + mean(x, na.rm = TRUE)

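Obviously, you need to know the mean of the original data set for this manual approach to be truly useful, but I am including it here for conceptual purposes.

Luckily, as previous answers have shown, the "centering" value (i.e., the mean) and the scaling factor are stored in the attributes of the scale object, so this approach can be simplified to: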

How to do it in R

unscaled_vals <- xs * attr(xs, 'scaled:scale') + attr(xs, 'scaled:center')
  • where xs is a scaled object created by scale(x).
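
As a rough check that the manual route and the attribute route agree, here is a small sketch with a made-up vector containing a missing value. The variable names n, scale_factor, manual, and from_attr are assumptions for illustration. In both routes the scaled values are multiplied by the scaling factor before the mean is added back.

```r
# Sketch: unscale by hand and via attributes, including an NA.
x  <- c(1, 2, NA, 4, 10)  # made-up data with one missing value
xs <- scale(x)

# Manual route: rebuild the scaling factor and the mean from the original data
n <- sum(!is.na(x))
scale_factor <- sqrt(sum((x - mean(x, na.rm = TRUE))^2, na.rm = TRUE) / (n - 1))
manual <- as.numeric(xs) * scale_factor + mean(x, na.rm = TRUE)

# Attribute route: use what scale() stored on the object
from_attr <- as.numeric(xs) * attr(xs, "scaled:scale") +
  attr(xs, "scaled:center")

print(all.equal(manual, x))
print(all.equal(from_attr, x))
```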

  • In `unscaled_vals <- xs + attr(xs, 'scaled:scale') + attr(xs, 'scaled:center')`, you are adding the scale instead of multiplying by it. I tried to edit, but it would not let me because the change is too small xD (2 upvotes)

小智 6

I struggled with this problem, and I think I found a simpler solution using linear algebra.

# create matrix like object
a <- rnorm(1000,5,2)
b <- rnorm(1000,7,5) 

df <- cbind(a,b)

# get center and scaling values 
mean <- apply(df, 2, mean)
sd <- apply(df, 2, sd)

# scale data
s.df <- scale(df, center = mean, scale = sd)

#unscale data with linear algebra 
us.df <- t((t(s.df) * sd) + mean)
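
The two transposes are there because R recycles a vector down the columns of a matrix. After transposing, each column's standard deviation and mean line up with that column's values. The sketch below (small made-up data, with names such as col_means and col_sds chosen for illustration) contrasts this with the untransposed version, which silently mixes up the per-column parameters.

```r
# Sketch: why the transpose is needed when unscaling a matrix.
set.seed(1)  # assumption: seed added for reproducibility
m <- cbind(a = rnorm(5, 5, 2), b = rnorm(5, 7, 5))

col_means <- colMeans(m)
col_sds   <- apply(m, 2, sd)
s.m <- scale(m, center = col_means, scale = col_sds)

# Without transposing, the length-2 vectors are recycled down each column,
# so rows alternate between the parameters of a and b
wrong <- s.m * col_sds + col_means

# With transposing, each variable is paired with its own parameters
right <- t(t(s.m) * col_sds + col_means)

print(isTRUE(all.equal(right, m, check.attributes = FALSE)))
print(isTRUE(all.equal(wrong, m, check.attributes = FALSE)))
```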