How can I speed up Python instance initialization for millions of objects?

ted*_*511 4 python performance instance

I have defined a Python class named Edge as follows:

class Edge:
    def __init__(self):
        self.node1 = 0
        self.node2 = 0
        self.weight = 0

Now I have to create roughly 10^6 to 10^7 Edge instances, using the following:

edges= []
for (i,j,w) in ijw:
    edge = Edge()
    edge.node1 = i
    edge.node2 = j
    edge.weight = w
    edges.append(edge)

This takes about 2 seconds on my desktop. Is there a faster way to do it?

Mar*_*ers 10

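You can't make it much faster, but I would certainly use __slots__ to save on memory allocations. Also make it possible to pass in the attribute values when creating the instance: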

class Edge:
    __slots__ = ('node1', 'node2', 'weight')
    def __init__(self, node1=0, node2=0, weight=0):
        self.node1 = node1
        self.node2 = node2
        self.weight = weight
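As a rough illustration of the memory point, the sketch below compares a plain class with a slotted one. The name PlainEdge and the `colour` attribute are made up for this example; the observable difference is that a slotted instance carries no per-instance `__dict__` and rejects attributes that are not listed in `__slots__`.

```python
class PlainEdge:
    # Hypothetical name: same attributes as Edge, but without __slots__.
    def __init__(self, node1=0, node2=0, weight=0):
        self.node1 = node1
        self.node2 = node2
        self.weight = weight


class SlotsEdge:
    __slots__ = ('node1', 'node2', 'weight')

    def __init__(self, node1=0, node2=0, weight=0):
        self.node1 = node1
        self.node2 = node2
        self.weight = weight


plain = PlainEdge(1, 2, 3)
slotted = SlotsEdge(1, 2, 3)

# A plain instance stores its attributes in a per-instance dictionary;
# a slotted instance has no such dictionary at all.
plain_has_dict = hasattr(plain, '__dict__')
slotted_has_dict = hasattr(slotted, '__dict__')

# The trade-off: attributes not named in __slots__ cannot be added later.
try:
    slotted.colour = 'red'  # 'colour' is an illustrative, undeclared attribute
    extra_attr_allowed = True
except AttributeError:
    extra_attr_allowed = False
```

If the instances need arbitrary extra attributes later on, `__slots__` is probably not a good fit.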

With the updated __init__ you can then use a list comprehension:

edges = [Edge(*args) for args in ijw]

Together, these changes can shave a decent amount of time off creating the objects, roughly halving the time needed.
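To make the `Edge(*args)` unpacking concrete, here is a minimal sketch on three hand-written tuples (the sample data is invented for illustration). It builds the list both ways and checks that the resulting edges carry the same values.

```python
class Edge:
    __slots__ = ('node1', 'node2', 'weight')

    def __init__(self, node1=0, node2=0, weight=0):
        self.node1 = node1
        self.node2 = node2
        self.weight = weight


# Hypothetical sample input: (node1, node2, weight) tuples.
ijw = [(0, 1, 5), (1, 2, 7), (2, 0, 3)]

# Original approach: create an empty instance, then assign each attribute.
edges_loop = []
for i, j, w in ijw:
    edge = Edge()
    edge.node1 = i
    edge.node2 = j
    edge.weight = w
    edges_loop.append(edge)

# List comprehension: *args unpacks each tuple into the three
# positional parameters of __init__.
edges = [Edge(*args) for args in ijw]

# Both lists hold edges with identical attribute values.
same = all(
    (a.node1, a.node2, a.weight) == (b.node1, b.node2, b.weight)
    for a, b in zip(edges_loop, edges)
)
```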

Comparing the creation of 1 million objects; the setup:

>>> from random import randrange
>>> ijw = [(randrange(100), randrange(100), randrange(1000)) for _ in range(10 ** 6)]
>>> class OrigEdge:
...     def __init__(self):
...         self.node1 = 0
...         self.node2 = 0
...         self.weight = 0
...
>>> origloop = '''\
... edges= []
... for (i,j,w) in ijw:
...     edge = Edge()
...     edge.node1 = i
...     edge.node2 = j
...     edge.weight = w
...     edges.append(edge)
... '''
>>> class SlotsEdge:
...     __slots__ = ('node1', 'node2', 'weight')
...     def __init__(self, node1=0, node2=0, weight=0):
...         self.node1 = node1
...         self.node2 = node2
...         self.weight = weight
...
>>> listcomploop = '''[Edge(*args) for args in ijw]'''

and the timings:

>>> from timeit import Timer
>>> count, total = Timer(origloop, 'from __main__ import OrigEdge as Edge, ijw').autorange()
>>> (total / count) * 1000 # milliseconds
722.1121070033405
>>> count, total = Timer(listcomploop, 'from __main__ import SlotsEdge as Edge, ijw').autorange()
>>> (total / count) * 1000 # milliseconds
386.6706900007557

That's almost 2 times as fast.

Increasing the random input list to 10^7 items, the timing difference holds (these timings are in seconds):

>>> ijw = [(randrange(100), randrange(100), randrange(1000)) for _ in range(10 ** 7)]
>>> count, total = Timer(origloop, 'from __main__ import OrigEdge as Edge, ijw').autorange()
>>> (total / count)
7.183759553998243
>>> count, total = Timer(listcomploop, 'from __main__ import SlotsEdge as Edge, ijw').autorange()
>>> (total / count)
3.8709938440006226
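For anyone who wants to rerun the comparison outside an interactive session, the following is one possible script version of the session above. The helper names `seconds_per_run` and `compare` and the default size of 10^5 are choices made for this sketch, and it passes names through the `globals` argument of `Timer` instead of importing from `__main__`. Absolute numbers will differ by machine and Python version.

```python
from random import randrange
from timeit import Timer


class OrigEdge:
    def __init__(self):
        self.node1 = 0
        self.node2 = 0
        self.weight = 0


class SlotsEdge:
    __slots__ = ('node1', 'node2', 'weight')

    def __init__(self, node1=0, node2=0, weight=0):
        self.node1 = node1
        self.node2 = node2
        self.weight = weight


ORIG_LOOP = '''\
edges = []
for (i, j, w) in ijw:
    edge = Edge()
    edge.node1 = i
    edge.node2 = j
    edge.weight = w
    edges.append(edge)
'''

LISTCOMP_LOOP = 'edges = [Edge(*args) for args in ijw]'


def seconds_per_run(stmt, edge_cls, ijw):
    """Return the average time in seconds for one execution of stmt."""
    # The statement sees edge_cls under the name Edge, plus the input list.
    timer = Timer(stmt, globals={'Edge': edge_cls, 'ijw': ijw})
    count, total = timer.autorange()
    return total / count


def compare(n=10 ** 5):
    """Time both approaches on n random (node1, node2, weight) tuples."""
    ijw = [(randrange(100), randrange(100), randrange(1000)) for _ in range(n)]
    orig = seconds_per_run(ORIG_LOOP, OrigEdge, ijw)
    slots = seconds_per_run(LISTCOMP_LOOP, SlotsEdge, ijw)
    return orig, slots


if __name__ == '__main__':
    orig, slots = compare()
    print(f'original loop:        {orig * 1000:.1f} ms')
    print(f'slots + list comp.:   {slots * 1000:.1f} ms')
```

Passing a larger `n` to `compare`, such as `10 ** 6` or `10 ** 7`, matches the input sizes used in the timings above.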