I have 60 lines of code. Throughout the code there are several calls to random number generators including rnorm(). Is it enough to put set.seed(x) in the very beginning of the code or do I need to set.seed every time random number generation occurs in the code?
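That really depends on how you expect your code to change in the future.
If you expect to add commands earlier in the code that generate random numbers, and you want to reproduce the results you obtained before inserting them, you should call set.seed() at the appropriate places in your code.
Example: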
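For the simpler case where the script itself does not change, a single set.seed() at the top is enough for every rerun to produce the same numbers. A minimal sketch, assuming a hypothetical function run_script() that stands in for the 60-line script:

```r
# run_script() is a hypothetical stand-in for the whole script.
run_script <- function() {
  set.seed(42)                # one seed, set once at the very top
  x <- rnorm(5)               # every later draw continues the same stream
  y <- runif(5)
  z <- sample(1:100, 5)
  list(x = x, y = y, z = z)
}

first  <- run_script()
second <- run_script()
identical(first, second)
# [1] TRUE
```

The result only holds while the sequence of random draws stays the same; adding, removing, or reordering a draw shifts everything that comes after it.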
set.seed(1)
A <- rnorm(10)
B <- rnorm(10)
C <- rnorm(10) ## I always want "C" to be the results I get here
set.seed(1)
AA <- rnorm(10); BB <- rnorm(10); CC <- rnorm(10)
identical(A, AA)
# [1] TRUE
identical(B, BB)
# [1] TRUE
identical(C, CC)
# [1] TRUE
## Second example: an extra random draw, BA, is inserted between BB and CC
set.seed(1)
A <- rnorm(10); B <- rnorm(10); C <- rnorm(10)
set.seed(1)
AA <- rnorm(10); BB <- rnorm(10); BA <- rnorm(10); CC <- rnorm(10)
identical(A, AA)
# [1] TRUE
identical(B, BB)
# [1] TRUE
identical(C, CC)
# [1] FALSE
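In the above, if I want "C" to always be the same no matter what precedes it, I should set the seed right before creating it.
Note that because I did not reset the seed before creating C or CC, and the second example has a new call that generates random numbers between BB and CC, the values of C and CC now differ. If you want them to be the same, you have to insert another set.seed just before creating C and CC, as follows: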
set.seed(1)
A <- rnorm(10)
B <- rnorm(10)
set.seed(2)
C <- rnorm(10) ## I always want "C" to be the results I get here
set.seed(1)
AA <- rnorm(10); BB <- rnorm(10); BA <- rnorm(10)  ## extra draw inserted before CC
set.seed(2)
CC <- rnorm(10)
identical(A, AA)
# [1] TRUE
identical(B, BB)
# [1] TRUE
identical(C, CC)
# [1] TRUE
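One side effect of calling set.seed(2) in the middle of a script is that every draw after "C" now follows the seed-2 stream as well. If that matters, one possible approach is to pin "C" and then put the earlier RNG state back. The sketch below assumes the default RNG settings; with_local_seed() is a hypothetical helper written for this illustration, not a base R function (the withr package provides with_seed() for a similar purpose).

```r
# Hypothetical helper: evaluate `expr` under a fixed seed, then restore
# whatever RNG state was in place before the call.
with_local_seed <- function(seed, expr) {
  # .Random.seed exists only after the RNG has been seeded or used
  had_state <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (had_state) old_state <- get(".Random.seed", envir = globalenv())
  on.exit({
    if (had_state) {
      assign(".Random.seed", old_state, envir = globalenv())
    } else {
      rm(".Random.seed", envir = globalenv())
    }
  })
  set.seed(seed)
  expr  # lazily evaluated here, i.e. after set.seed(seed)
}

set.seed(1)
A <- rnorm(10); B <- rnorm(10)
C <- with_local_seed(2, rnorm(10))   # pinned to seed 2
D <- rnorm(10)                       # continues the seed-1 stream

set.seed(1)
AA <- rnorm(10); BB <- rnorm(10); BA <- rnorm(10)
CC <- with_local_seed(2, rnorm(10))

identical(C, CC)
# [1] TRUE
identical(D, BA)   # both are the third batch of draws from seed 1
# [1] TRUE
```

Here "C" stays fixed regardless of what precedes it, while the draws that follow it are unaffected by the inner seed.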