当参数保持不变时,最大限度地减少代价高昂的函数调用次数(python)

use*_*473 3 python optimization

假设有一个函数costly_function_a(x):

  1. 它的执行时间非常昂贵;
  2. 只要输入相同的输出,它就会返回相同的输出x; 和
  3. 除了返回输出之外,它不执行"附加任务".

在这些条件下,x我们可以将结果存储在临时变量中,然后使用该变量进行这些计算,而不是连续两次调用函数.

现在假设有一些功能(f(x),g(x)h(x)调用在下面的示例)costly_function_a(x),并且其中的一些功能可以调用彼此(在下面的例子中,g(x)h(x)两个呼叫f(x)).在这种情况下,使用在重复调用以上仍结果提到的简单的方法costly_function_a(x)用相同的x(见OkayVersion下文).我确实找到了一种最小化呼叫次数的方法,但它"丑陋"(见FastVersion下文).有没有更好的方法来做到这一点?

#Dummy functions representing extremely slow code.
#The goal is to call these costly functions as rarely as possible.
def costly_function_a(x):
    print("costly_function_a has been called.")
    return x #Dummy operation.
def costly_function_b(x):
    print("costly_function_b has been called.")
    return 5.*x #Dummy operation.

#Simplest (but slowest) implementation.
class SlowVersion:
    def __init__(self,a,b):
        self.a = a
        self.b = b
    def f(self,x): #Dummy operation.
        return self.a(x) + 2.*self.a(x)**2
    def g(self,x): #Dummy operation.
        return self.f(x) + 0.7*self.a(x) + .1*x
    def h(self,x): #Dummy operation.
        return self.f(x) + 0.5*self.a(x) + self.b(x) + 3.*self.b(x)**2

#Equivalent to SlowVersion, but call the costly functions less often.
class OkayVersion:
    def __init__(self,a,b):
        self.a = a
        self.b = b
    def f(self,x): #Same result as SlowVersion.f(x)
        a_at_x = self.a(x)
        return a_at_x + 2.*a_at_x**2
    def g(self,x): #Same result as SlowVersion.g(x)
        return self.f(x) + 0.7*self.a(x) + .1*x
    def h(self,x): #Same result as SlowVersion.h(x)
        a_at_x = self.a(x)
        b_at_x = self.b(x)
        return self.f(x) + 0.5*a_at_x + b_at_x + 3.*b_at_x**2

#Equivalent to SlowVersion, but calls the costly functions even less often.
#Is this the simplest way to do it? I am aware that this code is highly
#redundant. One could simplify it by defining some factory functions...
class FastVersion:
    def __init__(self,a,b):
        self.a = a
        self.b = b
    def f(self, x, _at_x=None): #Same result as SlowVersion.f(x)
        if _at_x is None:
            _at_x = dict()
        if 'a' not in _at_x:
            _at_x['a'] = self.a(x)
        return _at_x['a'] + 2.*_at_x['a']**2
    def g(self, x, _at_x=None): #Same result as SlowVersion.g(x)
        if _at_x is None:
            _at_x = dict()
        if 'a' not in _at_x:
            _at_x['a'] = self.a(x)
        return self.f(x,_at_x) + 0.7*_at_x['a'] + .1*x
    def h(self,x,_at_x=None): #Same result as SlowVersion.h(x)
        if _at_x is None:
            _at_x = dict()
        if 'a' not in _at_x:
            _at_x['a'] = self.a(x)
        if 'b' not in _at_x:
            _at_x['b'] = self.b(x)
        return self.f(x,_at_x) + 0.5*_at_x['a'] + _at_x['b'] + 3.*_at_x['b']**2

if __name__ == '__main__':

    slow = SlowVersion(costly_function_a,costly_function_b)
    print("Using slow version.")
    print("f(2.) = " + str(slow.f(2.)))
    print("g(2.) = " + str(slow.g(2.)))
    print("h(2.) = " + str(slow.h(2.)) + "\n")

    okay = OkayVersion(costly_function_a,costly_function_b)
    print("Using okay version.")
    print("f(2.) = " + str(okay.f(2.)))
    print("g(2.) = " + str(okay.g(2.)))
    print("h(2.) = " + str(okay.h(2.)) + "\n")

    fast = FastVersion(costly_function_a,costly_function_b)
    print("Using fast version 'casually'.")
    print("f(2.) = " + str(fast.f(2.)))
    print("g(2.) = " + str(fast.g(2.)))
    print("h(2.) = " + str(fast.h(2.)) + "\n")

    print("Using fast version 'optimally'.")
    _at_x = dict()
    print("f(2.) = " + str(fast.f(2.,_at_x)))
    print("g(2.) = " + str(fast.g(2.,_at_x)))
    print("h(2.) = " + str(fast.h(2.,_at_x)))
    #Of course, one must "clean up" _at_x before using a different x...
Run Code Online (Sandbox Code Playgroud)

此代码的输出是:

Using slow version.
costly_function_a has been called.
costly_function_a has been called.
f(2.) = 10.0
costly_function_a has been called.
costly_function_a has been called.
costly_function_a has been called.
g(2.) = 11.6
costly_function_a has been called.
costly_function_a has been called.
costly_function_a has been called.
costly_function_b has been called.
costly_function_b has been called.
h(2.) = 321.0

Using okay version.
costly_function_a has been called.
f(2.) = 10.0
costly_function_a has been called.
costly_function_a has been called.
g(2.) = 11.6
costly_function_a has been called.
costly_function_b has been called.
costly_function_a has been called.
h(2.) = 321.0

Using fast version 'casually'.
costly_function_a has been called.
f(2.) = 10.0
costly_function_a has been called.
g(2.) = 11.6
costly_function_a has been called.
costly_function_b has been called.
h(2.) = 321.0

Using fast version 'optimally'.
costly_function_a has been called.
f(2.) = 10.0
g(2.) = 11.6
costly_function_b has been called.
h(2.) = 321.0
Run Code Online (Sandbox Code Playgroud)

请注意,我不想"存储" x过去使用的所有值的结果(因为这将需要太多内存).此外,我不希望有一个函数返回表单的元组,(f,g,h)因为有些情况我只想要f(所以不需要评估costly_function_b).

Mar*_*ers 11

您正在寻找的是LRU缓存; 只缓存最近使用的项目,限制内存使用以平衡调用成本和内存要求.

由于您使用不同的值调用昂贵的函数x,x因此缓存最多多个返回值(每个唯一值),并在缓存已满时丢弃最近最少使用的缓存结果.

从Python 3.2开始,标准库附带了一个装饰器实现@functools.lru_cache():

from functools import lru_cache

@lru_cache(16)  # cache 16 different `x` return values
def costly_function_a(x):
    print("costly_function_a has been called.")
    return x #Dummy operation.

@lru_cache(32)  # cache 32 different `x` return values
def costly_function_b(x):
    print("costly_function_b has been called.")
    return 5.*x #Dummy operation.
Run Code Online (Sandbox Code Playgroud)

对于早期版本,可以使用backport,或者选择可以处理可在PyPI上使用的LRU高速缓存的其他可用库之一.

如果您只需要缓存一个最近的项目,请创建自己的装饰器:

from functools import wraps

def cache_most_recent(func):
    cache = [None, None]
    @wraps(func)
    def wrapper(*args, **kw):
        if (args, kw) == cache[0]:
            return cache[1]
        cache[0] = args, kw
        cache[1] = func(*args, **kw)
        return cache[1]
    return wrapper

@cache_most_recent
def costly_function_a(x):
    print("costly_function_a has been called.")
    return x #Dummy operation.

@cache_most_recent
def costly_function_b(x):
    print("costly_function_b has been called.")
    return 5.*x #Dummy operation.
Run Code Online (Sandbox Code Playgroud)

这个更简单的装饰器比更具特色的装饰器具有更少的开销functools.lru_cache().