del*_*358 5 python iterator numpy data-processing python-3.x
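This is my first question on Stack Overflow, as I have only just started scripting with Python 3.
Application
I wrote a Python 3 script that writes the load definition of a moving heat source for finite element simulations in LS-Dyna. As input, I have a discretized 3D field of heat generation rate density (W/cm^3), the coordinates that define the finite element mesh, and the position of the heat field center over time. As output, I get a time-dependent heating power for each finite element, sorted by element number. This already works for reasonable sizes (200,000 finite elements, 3,000 positions of the heat field, 400,000 data points in the heat field).
Problem
For larger finite element meshes (4,000,000 elements), I run out of memory (60 GB RAM, 64-bit Python 3). To illustrate the problem, I prepared a minimal example that runs on its own. It generates some artificial test data: a finite element mesh (a regular grid here, which the real mesh is not) and an iterator over the new positions of the heat application.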
import numpy as np
import math
from scipy.interpolate import RegularGridInterpolator

def main():
    dataCoordinateAxes, dataArray = makeTestData()
    meshInformationArray = makeSampleMesh()
    coordinates = makeSampleCoordinates()
    interpolateOnMesh(dataCoordinateAxes, dataArray, meshInformationArray, coordinates)

def makeTestData():
    # Artificial heat field on a regular 300 x 300 x 4 grid
    x = np.linspace(-0.02, 0.02, 300)
    y = np.linspace(-0.02, 0.02, 300)
    z = np.linspace(-0.005, 0.005, 4)
    data = f(*np.meshgrid(x, y, z, indexing='ij', sparse=True))
    return (x, y, z), data

def f(x, y, z):
    # Product of three Gaussian weights, scaled to a large magnitude
    scaling = 1E18
    sigmaXY = 0.01
    muXY = 0
    sigmaZ = 0.5
    muZ = 0.005
    return weight(x, 1E-4, muXY, sigmaXY) * weight(y, 1E-4, muXY, sigmaXY) * weight(z, 0.1, muZ, sigmaZ) * scaling

def weight(x, dx, mu, sigma):
    # Gaussian probability density at x, multiplied by the interval width dx
    result = np.multiply(np.divide(np.exp(np.divide(np.square(np.subtract(x, mu)), (-2 * sigma**2))), math.sqrt(2 * math.pi * sigma**2.)), dx)
    return result

def makeSampleMesh():
    # 450 * 450 * 5 = 1,012,500 mesh points, one row [x, y, z] per point
    meshInformation = []
    for x in np.linspace(-0.3, 0.3, 450):
        for y in np.linspace(-0.3, 0.3, 450):
            for z in np.linspace(-0.005, 0.005, 5):
                meshInformation.append([x, y, z])
    return np.array(meshInformation)

def makeSampleCoordinates():
    # 500 heat source positions on a semicircle of radius 0.2 in the plane z = 0
    x = np.linspace(-0.2, 0.2, 500)
    y = np.sqrt(np.subtract(0.2**2, np.square(x)))
    # Note: this returns a generator, which can be consumed only once
    return (np.array([element[0], element[1], 0]) for element in zip(x, y))
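As a side note on the test data, the nested loops in makeSampleMesh build a Python list of about one million small lists before converting it to an array. A minimal sketch of a vectorized alternative follows; the function names `make_sample_mesh_loops` and `make_sample_mesh_vectorized` and their size parameters are made up for this illustration and are not part of the original script. It only affects how the test mesh is built, not the memory problem in the interpolation step.

```python
import numpy as np

def make_sample_mesh_loops(nx, ny, nz):
    """Reference version: nested loops, mirroring makeSampleMesh."""
    points = []
    for x in np.linspace(-0.3, 0.3, nx):
        for y in np.linspace(-0.3, 0.3, ny):
            for z in np.linspace(-0.005, 0.005, nz):
                points.append([x, y, z])
    return np.array(points)

def make_sample_mesh_vectorized(nx=450, ny=450, nz=5):
    """Hypothetical vectorized variant that avoids the intermediate list."""
    x = np.linspace(-0.3, 0.3, nx)
    y = np.linspace(-0.3, 0.3, ny)
    z = np.linspace(-0.005, 0.005, nz)
    # indexing='ij' keeps x as the slowest-varying axis and z as the fastest,
    # which matches the loop nesting order of the reference version
    grid = np.meshgrid(x, y, z, indexing='ij')
    # Stack to shape (nx, ny, nz, 3), then flatten to one [x, y, z] row per point
    return np.stack(grid, axis=-1).reshape(-1, 3)

# Compare both versions on a small grid
small_loops = make_sample_mesh_loops(4, 3, 2)
small_vectorized = make_sample_mesh_vectorized(4, 3, 2)
same_points = np.array_equal(small_loops, small_vectorized)
```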
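The interpolation is then done in the following function. I removed everything inside the for loop to isolate the problem. In the real script, I export the load curves to a file in a specific format.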
def interpolateOnMesh(dataCoordinateAxes, dataArray, meshInformationArray, coordinates):
    # fill_value=None lets the interpolator extrapolate outside the data grid
    interpolationFunction = RegularGridInterpolator(dataCoordinateAxes, dataArray, bounds_error=False, fill_value=None)
    for finiteElementNumber, heatGenerationCurve in enumerate(iterateOverFiniteElements(meshInformationArray, coordinates, interpolationFunction)):
        pass  # the real script writes the load curve to a file here
    return

def iterateOverFiniteElements(meshInformationArray, coordinates, interpolationFunction):
    # One interpolated array (one value per mesh point) for each heat source position
    meshDataIterator = (np.nditer(interpolationFunction(np.subtract(meshInformationArray, coordinateSystem))) for coordinateSystem in coordinates)
    # Transpose: yield, per finite element, its values over all positions
    for heatGenerationCurve in zip(*meshDataIterator):
        yield heatGenerationCurve

if __name__ == '__main__':
    main()
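To pin down the problem, I tracked memory consumption over time. [Figure: memory consumption over time] It appears that iterating over the result arrays consumes a considerable amount of memory.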
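One likely contributor to the memory growth is the `zip(*meshDataIterator)` call: argument unpacking consumes the whole generator before `zip` produces its first tuple, so the interpolated arrays for all positions exist in memory at the same time. The sketch below demonstrates this with a small stand-in generator and gives a rough lower bound for the size of the results. The names `lazy_arrays` and `estimate_result_bytes` are made up for this illustration, and the estimate assumes float64 results and ignores temporaries created inside the interpolator.

```python
import numpy as np

def lazy_arrays(n_arrays, n_points, log):
    """Stand-in for the generator of interpolated arrays (hypothetical helper)."""
    for i in range(n_arrays):
        log.append(i)  # record when each array is actually produced
        yield np.full(n_points, float(i))

log = []
rows = zip(*lazy_arrays(3, 4, log))
# The generator is already exhausted here, before any row has been requested
produced_before_first_row = list(log)
first_row = next(rows)

def estimate_result_bytes(n_elements, n_positions, itemsize=8):
    """Bytes needed to hold one float64 value per element and position."""
    return n_elements * n_positions * itemsize

# Minimal example from the question: 450 * 450 * 5 mesh points, 500 positions
sample_gb = estimate_result_bytes(450 * 450 * 5, 500) / 1e9
# Assumption: the large mesh (4,000,000 elements) is combined with 3,000 positions
large_gb = estimate_result_bytes(4_000_000, 3_000) / 1e9
```

Under these assumptions the minimal example needs about 4 GB for the results alone, and the large case about 96 GB, which would exceed the 60 GB of RAM mentioned above.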
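Question
Is there a less memory-intensive way to iterate over the data points without losing too much performance? If not, I suppose I will split the mesh array into chunks and interpolate them one at a time.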
The only solution I have found so far is to split meshInformationArray into chunks. Here is the modified main() function:
def main():
    dataCoordinateAxes, dataArray = makeTestData()
    meshInformationArray = makeSampleMesh()
    # Materialize the positions: a generator would be exhausted after the first chunk
    coordinates = list(makeSampleCoordinates())
    # Roughly 100,000 mesh points per chunk, and at least one chunk
    sections = max(1, meshInformationArray.shape[0] // 100000)
    for array in np.array_split(meshInformationArray, sections, axis=0):
        interpolateOnMesh(dataCoordinateAxes, dataArray, array, coordinates)
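The same chunking idea can also be written as a single generator that keeps global element numbers across chunks. The following is a minimal sketch under stated assumptions, not the method used in the script above: the name `iterate_curves_in_chunks` and its parameters are hypothetical, and `interpolate` stands for any callable that maps an (n, 3) array of points to n values, such as the RegularGridInterpolator instance from the question. It stores each chunk's results in one preallocated 2D array instead of combining `np.nditer` with `zip`.

```python
import numpy as np

def iterate_curves_in_chunks(mesh, positions, interpolate, chunk_size=100_000):
    """Yield (element_number, curve) pairs, processing the mesh chunk by chunk.

    mesh: array of shape (n_elements, 3)
    positions: iterable of heat source positions, each of length 3
    interpolate: callable mapping an (n, 3) array to an array of n values
    """
    # Materialize the positions once so that every chunk can reuse them
    positions = np.array(list(positions), dtype=float)
    for start in range(0, mesh.shape[0], chunk_size):
        chunk = mesh[start:start + chunk_size]
        # One row per element in the chunk, one column per position
        curves = np.empty((chunk.shape[0], positions.shape[0]))
        for j, position in enumerate(positions):
            curves[:, j] = interpolate(chunk - position)
        for i, curve in enumerate(curves):
            yield start + i, curve

# Small demonstration with a stand-in for the interpolator (assumption)
demo_mesh = np.arange(21.0).reshape(7, 3)
demo_positions = (np.array(p) for p in [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
demo_interpolate = lambda points: points.sum(axis=1)
demo_result = list(iterate_curves_in_chunks(demo_mesh, demo_positions,
                                            demo_interpolate, chunk_size=3))
```

With this layout, peak memory for the results is bounded by roughly chunk_size times the number of positions times 8 bytes, independent of the total mesh size, so chunk_size can be tuned to the available RAM.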