Reading a binary file with Python

Question


I find reading a binary file with Python particularly difficult. Can you give me a hand? I need to read this file, which is easy to read in Fortran 90 with:

int*4 n_particles, n_groups
real*4 group_id(n_particles)
read (*) n_particles, n_groups
read (*) (group_id(j),j=1,n_particles)


In detail, the file format is:

Bytes 1-4 -- The integer 8.
Bytes 5-8 -- The number of particles, N.
Bytes 9-12 -- The number of groups.
Bytes 13-16 -- The integer 8.
Bytes 17-20 -- The integer 4*N.
Next many bytes -- The group ID numbers for all the particles.
Last 4 bytes -- The integer 4*N.


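How can I read this with Python? I have tried everything, but nothing worked. Is there any chance I could call an f90 program from Python to read this binary file and then save the data I need?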

Answer 1

gec*_*cco 130

Read the binary file content like this:

with open(fileName, mode='rb') as file: # b is important -> binary
    fileContent = file.read()


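Then "unpack" the binary data using struct.unpack:

The start bytes: struct.unpack("iiiii", fileContent[:20])

The body: ignore the header bytes and the trailing bytes (20 + 4 = 24 in total); what remains is the body. To get the number of integers in the body, integer-divide its byte length by 4, then repeat the string 'i' that many times to build the format for unpack: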

struct.unpack("i" * ((len(fileContent) -24) // 4), fileContent[20:-4])


The end bytes: struct.unpack("i", fileContent[-4:])
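The three steps above can be combined into one function that also checks the record markers. This is a minimal sketch, not the answerer's code: the name parse_groups, the little-endian prefix "<", and the synthetic sample bytes are assumptions made for illustration. If the group IDs are stored as real*4, as the Fortran declaration in the question suggests, the body format would need 'f' instead of 'i'.

```python
import struct

def parse_groups(blob):
    """Parse header, group IDs and trailer from the layout in the question."""
    # "<" (little-endian, no padding) is an assumption about the writing machine
    head, n_particles, n_groups, head_end, body_len = struct.unpack("<5i", blob[:20])
    if head != 8 or head_end != 8:
        raise ValueError("unexpected record markers around the header")
    if body_len != 4 * n_particles:
        raise ValueError("body length does not match the particle count")
    # One 4-byte integer per particle
    group_ids = struct.unpack(f"<{n_particles}i", blob[20:20 + body_len])
    (tail,) = struct.unpack("<i", blob[20 + body_len:24 + body_len])
    if tail != body_len:
        raise ValueError("trailing record marker does not match")
    return n_particles, n_groups, list(group_ids)

# Synthetic file contents for illustration: N = 3 particles, 2 groups
sample = (struct.pack("<5i", 8, 3, 2, 8, 12)
          + struct.pack("<3i", 0, 1, 1)
          + struct.pack("<i", 12))
print(parse_groups(sample))  # (3, 2, [0, 1, 1])
```

Checking the markers costs little and catches a wrong byte order early, because a byte-swapped 8 no longer equals 8.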

Answer 2

unw*_*ind 24

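In general, I would recommend looking into Python's struct module for this. It ships with Python, and it should be easy to translate the specification in your question into a format string suitable for struct.unpack().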

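Note that if there is "invisible" padding between or around the fields, you will need to figure it out and include it in the unpack() call, or you will read the wrong bits.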
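To make the padding point concrete, here is a small sketch using struct.calcsize. The format strings and the sample bytes are made up for illustration and are unrelated to the file in the question.

```python
import struct

# Native mode ("@") may insert alignment padding between fields;
# the standard modes ("<", ">", "=") never do.
native = struct.calcsize("@ci")   # platform-dependent, commonly 8
packed = struct.calcsize("<ci")   # always 1 + 4 = 5

print(native, packed)

# The "x" format code skips one pad byte; "3x" skips three of them
value = struct.unpack("<c3xi", b"A\x00\x00\x00\x07\x00\x00\x00")
print(value)  # (b'A', 7)
```

When the layout of a file is fixed by a specification, an explicit "<" or ">" prefix plus "x" pad bytes tends to be more predictable than relying on native alignment.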

Reading the contents of the file so that you have something to unpack is trivial:

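import struct

# Read the whole file; the with block closes it afterwards
with open("from_fortran.bin", "rb") as f:
    data = f.read()

# unpack() needs a buffer of exactly the format's size, so slice the first 8 bytes
(eight, N) = struct.unpack("@II", data[:8])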


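This unpacks the first two fields, assuming they start at the very beginning of the file (no padding or extraneous data) and that the file uses native byte order (the @ symbol). The I characters in the format string mean "unsigned integer, 32 bits".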

Answer 3

Chr*_*ris 12

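You could use numpy.fromfile, which can read data from both text and binary files. First construct a data type that represents your file format with numpy.dtype, then read that type from the file with numpy.fromfile.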
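A rough sketch of that approach for the layout in the question follows. The field names, the helper read_groups, the little-endian "<i4" type and the temporary sample file are all assumptions for illustration. Because the body length depends on N, the fixed-size header is read first and the IDs second.

```python
import os
import tempfile

import numpy as np

# Fixed-size part of the layout: five 32-bit integers (little-endian assumed)
header_dtype = np.dtype([
    ("rec1_len", "<i4"), ("n_particles", "<i4"), ("n_groups", "<i4"),
    ("rec1_end", "<i4"), ("rec2_len", "<i4"),
])

def read_groups(path):
    # Read exactly one header record from the start of the file
    header = np.fromfile(path, dtype=header_dtype, count=1)[0]
    n = int(header["n_particles"])
    # offset= skips the 20 header bytes (parameter available in NumPy 1.17+)
    ids = np.fromfile(path, dtype="<i4", count=n, offset=header_dtype.itemsize)
    return n, int(header["n_groups"]), ids

# Write a small synthetic file: N = 3 particles, 2 groups
raw = np.array([8, 3, 2, 8, 12, 0, 1, 1, 12], dtype="<i4")
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(raw.tobytes())
n, n_groups, ids = read_groups(tmp.name)
os.remove(tmp.name)
print(n, n_groups, ids.tolist())
```

For large files this avoids building a long format string, since the IDs arrive directly as a NumPy array.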

Easy to miss this! The docs are a bit thin; see https://www.reddit.com/r/Python/comments/19q8nt/psa_think_using_numpy_if_you_need_to_parse_a/ for some discussion (2 upvotes)

Answer 4

Eug*_*ash 5

To read a binary file into a bytes object:

from pathlib import Path
data = Path('/path/to/file').read_bytes()  # Python 3.5+


To create an int from bytes 0-3 of the data:

i = int.from_bytes(data[:4], byteorder='little', signed=False)


To unpack multiple ints from the data:

import struct
ints = struct.unpack('iiii', data[:16])

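The two approaches give the same values as long as the byte order is stated the same way in both. The sketch below uses a synthetic 16-byte header rather than a real file, and the little-endian order is an assumption.

```python
import struct

# Synthetic 16-byte header in little-endian order: 8, N = 3, 2 groups, 8
data = struct.pack("<4i", 8, 3, 2, 8)

# Decode each 4-byte field with int.from_bytes ...
fields = [int.from_bytes(data[i:i + 4], byteorder="little", signed=True)
          for i in range(0, 16, 4)]

# ... and compare with a single struct.unpack call using an explicit "<" prefix
assert tuple(fields) == struct.unpack("<4i", data)

# The same four bytes read with the wrong byte order give a different number
wrong = int.from_bytes(data[:4], byteorder="big", signed=True)
print(fields, wrong)  # [8, 3, 2, 8] 134217728
```

A leading record marker that decodes to 134217728 instead of 8 is a strong hint that the file was written with the opposite byte order.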

Archived:	14 years, 1 month ago
Viewed:	210269 times
Last activity:	7 years, 3 months ago