I have an explanatory variable that is centered using scale(), and I use it to predict a response variable:
d <- data.frame(
x=runif(100),
y=rnorm(100)
)
d <- within(d, s.x <- scale(x))
m1 <- lm(y~s.x, data=d)
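I would like to plot the predicted values, but on the original scale of x rather than the centered scale. Is there a way to back-transform or unscale s.x?
Thanks!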
Jus*_*tin 55
Take a look at:
attributes(d$s.x)
You can use the attributes to unscale:
d$s.x * attr(d$s.x, 'scaled:scale') + attr(d$s.x, 'scaled:center')
For example:
> x <- 1:10
> s.x <- scale(x)
> s.x
[,1]
[1,] -1.4863011
[2,] -1.1560120
[3,] -0.8257228
[4,] -0.4954337
[5,] -0.1651446
[6,] 0.1651446
[7,] 0.4954337
[8,] 0.8257228
[9,] 1.1560120
[10,] 1.4863011
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765
> s.x * attr(s.x, 'scaled:scale') + attr(s.x, 'scaled:center')
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765
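Applied to the model in the question, the same idea might look like the sketch below. It assumes the data frame and model are built as in the question; the seed, the name x_orig, and the plot styling are illustrative choices, not part of the original. Note that as.numeric() is used to drop the matrix shape and the scale() attributes that the output above still carries.

```r
# Simulated data, as in the question (the seed is an arbitrary choice)
set.seed(42)
d <- data.frame(x = runif(100), y = rnorm(100))
d <- within(d, s.x <- scale(x))
m1 <- lm(y ~ s.x, data = d)

# Undo the scaling; as.numeric() drops the matrix shape and the scale() attributes
x_orig <- as.numeric(d$s.x * attr(d$s.x, "scaled:scale") + attr(d$s.x, "scaled:center"))

# Fitted values are already in the units of y, so only the x axis needs converting
ord <- order(x_orig)
plot(x_orig, d$y, xlab = "x (original scale)", ylab = "y")
lines(x_orig[ord], fitted(m1)[ord], col = "red")
```

Because only the predictor was scaled, the fitted values themselves need no back-transformation.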
Fer*_*ndo 12
For a data frame or matrix:
set.seed(1)
x = matrix(sample(1:12), ncol= 3)
xs = scale(x, center = TRUE, scale = TRUE)
x.orig = t(apply(xs, 1, function(r)r*attr(xs,'scaled:scale') + attr(xs, 'scaled:center')))
print(x)
[,1] [,2] [,3]
[1,] 4 2 3
[2,] 5 7 1
[3,] 6 10 11
[4,] 9 12 8
print(x.orig)
[,1] [,2] [,3]
[1,] 4 2 3
[2,] 5 7 1
[3,] 6 10 11
[4,] 9 12 8
Be careful when using functions like identical():
print(x - x.orig)
[,1] [,2] [,3]
[1,] 0 0 0.000000e+00
[2,] 0 0 8.881784e-16
[3,] 0 0 0.000000e+00
[4,] 0 0 0.000000e+00
identical(x, x.orig)
# FALSE
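A tolerance-based comparison is probably the safer check here. The sketch below assumes the same x and xs as above, and rebuilds x.orig with sweep() instead of apply() purely for brevity. Besides the rounding error, identical() also fails in this example because x is an integer matrix while the unscaled result is double, and because the result carries extra attributes.

```r
set.seed(1)
x <- matrix(sample(1:12), ncol = 3)
xs <- scale(x, center = TRUE, scale = TRUE)

# Multiply each column by its scale, then add back its center
x.orig <- sweep(xs, 2, attr(xs, "scaled:scale"), `*`)
x.orig <- sweep(x.orig, 2, attr(xs, "scaled:center"), `+`)

# identical() is strict about type, attributes and every last bit
identical(x, x.orig)

# all.equal() compares numerically within a small tolerance
isTRUE(all.equal(x.orig, x, check.attributes = FALSE))
```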
I felt like this should be a proper function, so here is my attempt:
#' Reverse a scale
#'
#' Computes x = sz+c, which is the inverse of z = (x - c)/s
#' provided by the \code{scale} function.
#'
#' @param z a numeric matrix(like) object
#' @param center either NULL or a numeric vector of length equal to the number of columns of z
#' @param scale either NULL or a numeric vector of length equal to the number of columns of z
#'
#' @seealso \code{\link{scale}}
#' @examples
#' mtcs <- scale(mtcars)
#'
#' all.equal(
#' unscale(mtcs),
#' as.matrix(mtcars),
#' check.attributes=FALSE
#' )
#'
#' @export
unscale <- function(z, center = attr(z, "scaled:center"), scale = attr(z, "scaled:scale")) {
if(!is.null(scale)) z <- sweep(z, 2, scale, `*`)
if(!is.null(center)) z <- sweep(z, 2, center, `+`)
structure(z,
"scaled:center" = NULL,
"scaled:scale" = NULL,
"unscaled:center" = center,
"unscaled:scale" = scale
)
}
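One situation where the explicit center and scale arguments seem useful is after subsetting, since `[` drops the "scaled:" attributes from a matrix. The sketch below repeats a trimmed version of the function (without the bookkeeping attributes) so that it runs on its own; the names ctr, scl, head_scaled and head_orig are illustrative.

```r
# Trimmed copy of the function above, returning only the unscaled values
unscale <- function(z, center = attr(z, "scaled:center"), scale = attr(z, "scaled:scale")) {
  if (!is.null(scale)) z <- sweep(z, 2, scale, `*`)
  if (!is.null(center)) z <- sweep(z, 2, center, `+`)
  z
}

mtcs <- scale(mtcars)

# Save the parameters before subsetting, because `[` discards them
ctr <- attr(mtcs, "scaled:center")
scl <- attr(mtcs, "scaled:scale")
head_scaled <- mtcs[1:5, ]
is.null(attr(head_scaled, "scaled:center"))

# Pass the saved parameters explicitly
head_orig <- unscale(head_scaled, center = ctr, scale = scl)
all.equal(head_orig, as.matrix(mtcars)[1:5, ], check.attributes = FALSE)
```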
tl;dr:
unscaled_vals <- xs * attr(xs, 'scaled:scale') + attr(xs, 'scaled:center')
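where xs is a scaled object created by scale(x).
Just for those trying to make some sense of this:
How R scales:
The scale function performs both scaling and centering by default.
Centering is performed first. By default, centering is achieved by subtracting the mean of all !is.na input values from each value:
data - mean(data, na.rm = TRUE)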
Scaling is then achieved by dividing by:
sqrt(sum(x^2) / (n - 1))
where x is the set of all !is.na values to be scaled and n = length(x).
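A quick way to see this root-mean-square divisor on its own is to turn centering off. The sketch below uses a made-up vector; rms is an illustrative name.

```r
x <- c(1, 2, 3, 4, 5)

# With center = FALSE, x really is the raw data
xs <- scale(x, center = FALSE)

# Root mean square with an (n - 1) denominator
rms <- sqrt(sum(x^2) / (length(x) - 1))

# The divisor stored by scale() matches the formula
all.equal(attr(xs, "scaled:scale"), rms)

# Without centering, it is not the standard deviation
isTRUE(all.equal(rms, sd(x)))
```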
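Importantly, though, when center = TRUE in scale, x is not the original data set but the already centered data.
So if center = TRUE (the default), the scale function is really calculating:
sqrt(sum((data - mean(data, na.rm = TRUE))^2, na.rm = TRUE) / (n - 1))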
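[When center = TRUE] this is the same as taking the standard deviation: sd(data) (with na.rm = TRUE if the data contain missing values).
How to unscale:
Explanation:
First multiply the scaled values (xs) by the scaling factor, computed from the original data x:
y = xs * sqrt(sum((x - mean(x, na.rm = TRUE))^2) / (length(x) - 1))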
Then add the mean back:
y + mean(x, na.rm = TRUE)
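Obviously, you need to know the mean of the original data set for this manual approach to be truly useful, but I include it here for the sake of the concept.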
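The manual steps can be checked on a small made-up vector without missing values. In this sketch, scale_factor and y are illustrative names.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
xs <- scale(x)
n <- length(x)

# Scaling factor computed by hand from the centered data
scale_factor <- sqrt(sum((x - mean(x))^2) / (n - 1))

# It should agree with sd() and with the attribute stored by scale()
all.equal(scale_factor, sd(x))
all.equal(scale_factor, attr(xs, "scaled:scale"))

# Multiply by the scaling factor, then add the mean back
y <- as.numeric(xs) * scale_factor + mean(x)
all.equal(y, x)
```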
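Fortunately, as previous answers have shown, the "centering" value (i.e. the mean) and the scaling factor are stored in the attributes of the scaled object, so this approach can be simplified to:
How to do it in R:
unscaled_vals <- xs * attr(xs, 'scaled:scale') + attr(xs, 'scaled:center')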
where xs is a scaled object created by scale(x).
小智 6
I ran into this problem, and I think I found a simpler solution using linear algebra.
# create matrix like object
a <- rnorm(1000,5,2)
b <- rnorm(1000,7,5)
df <- cbind(a,b)
# get center and scaling values
mean <- apply(df, 2, mean)
sd <- apply(df, 2, sd)
# scale data
s.df <- scale(df, center = mean, scale = sd)
#unscale data with linear algebra
us.df <- t((t(s.df) * sd) + mean)
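The transposes are there because R recycles a vector down the columns of a matrix. After t(), each column of the original data becomes a row, so the per-column values line up with the right entries. The sketch below compares this with an equivalent sweep() version; the seed and the names m, col_means and col_sds are illustrative, chosen to avoid reusing the names mean, sd and df.

```r
set.seed(123)
m <- cbind(a = rnorm(1000, 5, 2), b = rnorm(1000, 7, 5))

# Per-column centering and scaling values
col_means <- colMeans(m)
col_sds <- apply(m, 2, sd)
s.m <- scale(m, center = col_means, scale = col_sds)

# Transpose approach: recycling runs over columns of the original data
us_t <- t(t(s.m) * col_sds + col_means)

# Equivalent sweep() approach, which states the margin explicitly
us_sweep <- sweep(sweep(s.m, 2, col_sds, `*`), 2, col_means, `+`)

all.equal(us_t, us_sweep, check.attributes = FALSE)
all.equal(us_t, m, check.attributes = FALSE)
```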