GSh*_*ked 5 python replace out-of-memory large-files
I have a ~600 MB Roblox-type .mesh file that reads like a text file in any text editor. I have the following code:
mesh = open("file.mesh", "r").read()
mesh = mesh.replace("[", "{").replace("]", "}").replace("}{", "},{")
mesh = "{"+mesh+"}"
f = open("p2t.txt", "w")
f.write(mesh)
It returns:
Traceback (most recent call last):
File "C:\TheDirectoryToMyFile\p2t2.py", line 2, in <module>
mesh = mesh.replace("[", "{").replace("]", "}").replace("}{", "},{")
MemoryError
Here is a sample of my file:
[-0.00599, 0.001466, 0.006][0.16903, 0.84515, 0.50709][0.00000, 0.00000, 0][-0.00598, 0.001472, 0.00599][0.09943, 0.79220, 0.60211][0.00000, 0.00000, 0]
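What can I do?
Edit:
I'm not sure what the head, follow, and tail commands are in the other thread this was marked as a duplicate of. I tried to use them but couldn't get it to work. The file is also one giant line; it isn't split into multiple lines.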
You need to read a small portion on each iteration, process it, and write the result to another file or to sys.stdout. Try this code:
with open("file.mesh", "r") as mesh, open("file-1.mesh", "w") as mesh_out:
    # The first character is the opening "[": emit "{" without a leading comma.
    c = mesh.read(1)
    if c:
        mesh_out.write("{")

    while True:
        c = mesh.read(1)
        if c == "":  # end of file
            break

        if c == "[":
            mesh_out.write(",{")
        elif c == "]":
            mesh_out.write("}")
        else:
            mesh_out.write(c)

Update:
It ran very slowly (thanks, jamylak), so I changed it:
import sys
import re


def process_file(fname):
    with open(fname, "r") as mesh:
        # Consume the first "[" and emit "{" without a leading comma.
        c = mesh.read(1)
        if c == '':
            return
        sys.stdout.write('{')

        while True:
            c = mesh.read(8192)  # read the rest in 8192-character chunks
            if c == '':
                return

            c = re.sub(r'\[', ',{', c)
            c = re.sub(r'\]', '}', c)
            sys.stdout.write(c)


if __name__ == '__main__':
    process_file(sys.argv[1])

Now it takes about 15 seconds to process a 1.4 GB file. Run it like this:
$ python mesh.py file.mesh > file-1.mesh
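Note that the answer's code replaces the first "[" with "{", so its output has the form {..},{..} rather than the {{..},{..}} that the question's in-memory code would produce. If the exact output of the question's code is wanted, a chunked variant along the following lines might work. This is only a sketch: the names convert and convert_file and the default chunk size are made up for illustration, and it assumes Python 3 with the file opened in text mode.

```python
import io


def convert(src, dst, chunk_size=1 << 20):
    """Stream src to dst, turning [..][..] into {{..},{..}}."""
    dst.write("{")
    prev_closed = False  # True when the last character written was "}"
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty string means end of input
            break
        chunk = chunk.replace("[", "{").replace("]", "}").replace("}{", "},{")
        # A "}{" pair split across two chunks still needs its comma.
        if prev_closed and chunk.startswith("{"):
            dst.write(",")
        dst.write(chunk)
        prev_closed = chunk.endswith("}")
    dst.write("}")


def convert_file(in_path, out_path):
    # Hypothetical wrapper; the paths are supplied by the caller.
    with open(in_path, "r") as src, open(out_path, "w") as dst:
        convert(src, dst)


# Small in-memory check with a tiny chunk size to exercise chunk boundaries.
out = io.StringIO()
convert(io.StringIO("[1, 2][3, 4]"), out, chunk_size=3)
print(out.getvalue())  # {{1, 2},{3, 4}}
```

Only one chunk is held in memory at a time, so peak memory use depends on chunk_size rather than on the file size. For the question's file this would be called as convert_file("file.mesh", "p2t.txt").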