In R, you can create a sequence by specifying the start point, the end point, and the desired output length:
seq(1, 1.5, length.out=10)
# [1] 1.000000 1.055556 1.111111 1.166667 1.222222 1.277778 1.333333 1.388889 1.444444 1.500000
In Python, you can use numpy's arange function in a similar way, but there is no easy way to specify the output length. The best I could come up with:
np.append(np.arange(1, 1.5, step = (1.5-1)/9), 1.5)
# array([ 1. , 1.05555556, 1.11111111, 1.16666667, 1.22222222, 1.27777778, 1.33333333, 1.38888889, 1.44444444, 1.5 ])
Is there a cleaner way to do this?
Yes! A simple way to do this is numpy.linspace:
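numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)
Return evenly spaced numbers over a specified interval.
Returns num evenly spaced samples, calculated over the interval [start, stop].
The endpoint of the interval can optionally be excluded.
Example: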
[In 1] np.linspace(start=0, stop=50, num=5)
[Out 1] array([ 0. , 12.5, 25. , 37.5, 50. ])
Note that the samples are evenly spaced between the start and stop values: with num=5, the interval is divided into 4 equal steps of 12.5.
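Applied to the R example from the question, a minimal sketch might look like the following. It assumes NumPy is imported as np; the variable names are only illustrative.

```python
import numpy as np

# Same call shape as R's seq(1, 1.5, length.out=10): start, stop, number of points
r_like = np.linspace(1, 1.5, num=10)

# endpoint=False drops the stop value, so the spacing becomes (stop - start) / num
half_open = np.linspace(1, 1.5, num=10, endpoint=False)

# retstep=True additionally returns the spacing between consecutive samples
samples, step = np.linspace(1, 1.5, num=10, retstep=True)

print(r_like)
print(half_open)
print(step)
```

With the default endpoint=True the spacing is (stop - start) / (num - 1), which is the same step the question computed by hand as (1.5-1)/9.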
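For those who have trouble installing numpy (a less common problem these days), you could look into anaconda (or miniconda) or a similar distribution.
As an alternative (for those interested), if you want the functionality of R's seq(start, end, by, length.out), the following function provides it.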
def seq(start, end, by=None, length_out=None):
    len_provided = length_out is not None
    by_provided = by is not None
    if not by_provided and not len_provided:
        raise ValueError('At least by or length_out must be provided')
    eps = pow(10.0, -14)
    if by_provided:
        if abs(by) < eps:
            raise ValueError('by must be non-zero.')
        # Swap start and end if they appear reversed relative to the sign of by
        if (start > end and by > 0) or (start < end and by < 0):
            start, end = end, start
        # Compute the width only after the possible swap
        width = end - start
        absby = abs(by)
        if absby - abs(width) < eps:
            # by is a step size: whole steps that fit, plus the start point
            length_out = int(abs(width) / absby + eps) + 1
        else:
            # by is too great, we assume by is actually length_out
            length_out = int(absby)
            by = width / (length_out - 1)
    else:
        length_out = int(length_out)
        by = (end - start) / (length_out - 1)
    out = [float(start)] * length_out
    for i in range(1, length_out):
        out[i] += by * i
    return out
This function is somewhat slower than numpy.linspace (roughly 4x to 5x slower), but with numba we can get a function that is about 2x faster than np.linspace while keeping R's syntax.
from numba import jit
@jit(nopython = True, fastmath = True)
def seq(start, end, by = None, length_out = None):
    [function body]
We can call it just as we would in R.
seq(3, 5, 0.3)
#out: [3.0, 3.3, 3.6, 3.9, 4.2, 4.5, 4.8]
The implementation above also allows (to some degree) swapping "by" and "length_out": a "by" larger than the interval is treated as "length_out".
seq(0, 5, 10)
#out: [0.0,
 0.5555555555555556,
 1.1111111111111112,
 1.6666666666666667,
 2.2222222222222223,
 2.7777777777777777,
 3.3333333333333335,
 3.8888888888888893,
 4.444444444444445,
 5.0]
%timeit -r 100 py_seq(0.5, 1, 1000) #Python no jit
133 µs ± 20.9 µs per loop (mean ± std. dev. of 100 runs, 1000 loops each)

%timeit -r 100 seq(0.5, 1, 1000) #adding @jit(nopython = True, fastmath = True) prior to function definition
20.1 µs ± 2 µs per loop (mean ± std. dev. of 100 runs, 10000 loops each)

%timeit -r 100 linspace(0.5, 1, 1000)
46.2 µs ± 6.11 µs per loop (mean ± std. dev. of 100 runs, 10000 loops each)
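For comparison with the pure-Python seq above, the same R-style interface could also be sketched on top of NumPy alone. The name seq_np is hypothetical and not part of NumPy, and unlike the function above it assumes that exactly one of by and length_out is given and does not try to guess which one was meant.

```python
import numpy as np

def seq_np(start, end, by=None, length_out=None):
    """Hypothetical NumPy-based counterpart of R's seq(); not a NumPy API."""
    if (by is None) == (length_out is None):
        raise ValueError("Provide exactly one of by or length_out")
    if length_out is not None:
        # Fixed number of points, both endpoints included
        return np.linspace(start, end, num=int(length_out))
    if by == 0:
        raise ValueError("by must be non-zero")
    # Number of whole steps that fit; the small fuzz guards against
    # quotients like 2.9999999999999996 being floored to 2
    n_steps = int(np.floor((end - start) / by + 1e-10))
    if n_steps < 0:
        raise ValueError("by has the wrong sign for this start and end")
    # Build the sequence as start + by * [0, 1, ..., n_steps]
    return start + by * np.arange(n_steps + 1)

print(seq_np(1, 1.5, length_out=10))
print(seq_np(3, 5, by=0.3))
```

Building the by case from an integer step count, rather than passing a float step straight to np.arange, keeps the number of returned points predictable when the step does not divide the interval exactly in floating point.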