当我使用生成器表达式插入数据并从网格上的常规网格> 2000次插入迭代器时,我收到一个MemError

del*_*358 5 python iterator numpy data-processing python-3.x

这是我在stackoverflow的第一个问题,因为我开始使用Python3编写脚本.

应用

我制作了一个Python3脚本,用于在LS-Dyna中为有限元模拟编写可移动热源的负载定义.作为源,我具有离散化的3D发热率密度(W/cm ^ 3)场,定义有限元网格的坐标和热场中心随时间的位置.作为输出,我得到一个依赖于时间的加热功率,在每个有限元的元素编号之后排序.这已经用于合理的尺寸(200000个有限元,热场的3000个位置,热场中的400000个数据点).

问题

对于较大的有限元网格(4 000 000个元素),我的内存不足(60GB RAM,python3 64Bit).为了进一步说明这个问题,我准备了一个独立运行的最小例子.它生成一些人工测试数据,我使用它的有限元网格(实际上它不是常规网格)和热应用的新位置的迭代器.

import numpy as np
import math
from scipy.interpolate import RegularGridInterpolator
def main():
    dataCoordinateAxes,dataArray = makeTestData()
    meshInformationArray = makeSampleMesh()
    coordinates = makeSampleCoordinates()
    interpolateOnMesh(dataCoordinateAxes,dataArray,meshInformationArray,coordinates)

def makeTestData():
    x = np.linspace(-0.02,0.02,300)
    y = np.linspace(-0.02,0.02,300)
    z = np.linspace(-0.005,0.005,4)
    data = f(*np.meshgrid(x,y,z,indexing='ij',sparse=True))
    return (x,y,z),data

def f(x,y,z):
    scaling = 1E18
    sigmaXY = 0.01
    muXY = 0
    sigmaZ = 0.5
    muZ = 0.005
    return weight(x,1E-4,muXY,sigmaXY)*weight(y,1E-4,muXY,sigmaXY)*weight(z,0.1,muZ,sigmaZ)*scaling

def weight(x,dx,mu,sigma):
    result = np.multiply(np.divide(np.exp(np.divide(np.square(np.subtract(x,mu)),(-2*sigma**2))),math.sqrt(2*math.pi*sigma**2.)),dx)
    return result

def makeSampleMesh():
    meshInformation = []
    for x in np.linspace(-0.3,0.3,450):
        for y in np.linspace(-0.3,0.3,450):
            for z in np.linspace(-0.005,0.005,5):
                meshInformation.append([x,y,z])
    return np.array(meshInformation)

def makeSampleCoordinates():
    x = np.linspace(-0.2,0.2,500)
    y = np.sqrt(np.subtract(0.2**2,np.square(x)))
    return (np.array([element[0],element[1],0])for element in zip(x,y))
Run Code Online (Sandbox Code Playgroud)

然后在此功能中完成插值.我删除了for循环中的所有内容以隔离问题.实际上,我将负载曲线导出为特定格式的文件.

def interpolateOnMesh(dataCoordinateAxes,dataArray,meshInformationArray,coordinates):
    interpolationFunction = RegularGridInterpolator(dataCoordinateAxes, dataArray, bounds_error=False, fill_value=None)
    for finiteElementNumber, heatGenerationCurve in enumerate(iterateOverFiniteElements(meshInformationArray, coordinates, interpolationFunction)):
        pass
    return

def iterateOverFiniteElements(meshInformationArray, coordinates, interpolationFunction):
    meshDataIterator = (np.nditer(interpolationFunction(np.subtract(meshInformationArray,coordinateSystem))) for coordinateSystem in coordinates)
    for heatGenerationCurve in zip(*meshDataIterator):
        yield heatGenerationCurve

if __name__ == '__main__':
    main()
Run Code Online (Sandbox Code Playgroud)

为了确定问题,我随着时间的推移跟踪了内存消耗. 内存消耗随着时间的推移 似乎对结果数组的迭代消耗了大量的内存.

是否有更少的内存消耗方式迭代数据点而不会失去太多性能?如果没有,我想我会将网格数组切成块并逐个插入.

del*_*358 0

到目前为止,我找到的唯一解决方案是剪切meshInformationArray. 这里是修改后的main()函数:

def main():
    dataCoordinateAxes,dataArray = makeTestData()
    meshInformationArray = makeSampleMesh()
    coordinates = makeSampleCoordinates()
    sections = int(meshInformationArray.shape[0] / 100000)
    if sections == 0: sections = 1
    for array in iter(np.array_split(meshInformationArray, sections, axis=0)):
        interpolateOnMesh(dataCoordinateAxes,dataArray,array,coordinates)
Run Code Online (Sandbox Code Playgroud)