Convert a size notation with units (100kb, 32MB) to a number of bytes in Python

Rob*_*son 2 python type-conversion

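I need to use Python to convert upload size limits given in human-readable notation (e.g. 100kb, 32MB) into numbers. The converted value should be expressed as a number of bytes.

Example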

convert_size_to_bytes("32MB") # should return 33554432
convert_size_to_bytes("100kB") # should return 102400
convert_size_to_bytes("123B") # should return 123
convert_size_to_bytes("123") # should return 123
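The expected values above imply binary multiples (1 kB = 1024 bytes, 1 MB = 1024 ** 2 bytes) rather than decimal ones. A quick arithmetic check, as a minimal sketch (the names `KIB` and `MIB` are illustrative and do not come from the question):

```python
# Binary multiples implied by the expected results in the question.
KIB = 1024       # bytes in one "kB" as the question uses it
MIB = 1024 ** 2  # bytes in one "MB" as the question uses it

expected_32mb = 32 * MIB
expected_100kb = 100 * KIB

print(expected_32mb)   # 33554432
print(expected_100kb)  # 102400
```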

mli*_*ner 5

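I liked the simple approach @Robson took, but I noticed it has a number of corner cases it doesn't handle. The version below is largely the same, but adds/fixes the following: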

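  • Adds support for spelled-out unit names (byte, kilobyte, etc.).
  • Supports singular units, e.g. "1 byte".
  • Adds support for yottabytes, zettabytes, etc. The version above would crash on these.
  • Adds support for spaces before, after, and within the input.
  • Adds support for floats ("5.2 mb").
  • A bigger docstring.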

Here is the code:

def convert_size_to_bytes(size_str):
    """Convert human filesizes to bytes.

    Special cases:
     - singular units, e.g., "1 byte"
     - byte vs b
     - yottabytes, zettabytes, etc.
     - with & without spaces between & around units.
     - floats ("5.2 mb")

    To reverse this, see hurry.filesize or the Django filesizeformat template
    filter.

    :param size_str: A human-readable string representing a file size, e.g.,
    "22 megabytes".
    :return: The number of bytes represented by the string.
    """
    multipliers = {
        'kilobyte':  1024,
        'megabyte':  1024 ** 2,
        'gigabyte':  1024 ** 3,
        'terabyte':  1024 ** 4,
        'petabyte':  1024 ** 5,
        'exabyte':   1024 ** 6,
        'zettabyte': 1024 ** 7,
        'yottabyte': 1024 ** 8,
        'kb': 1024,
        'mb': 1024 ** 2,
        'gb': 1024 ** 3,
        'tb': 1024 ** 4,
        'pb': 1024 ** 5,
        'eb': 1024 ** 6,
        'zb': 1024 ** 7,
        'yb': 1024 ** 8,
    }

    # Normalize once: lowercase, trim whitespace, drop a trailing plural "s".
    size_str = size_str.lower().strip().rstrip('s')
    for suffix, multiplier in multipliers.items():
        if size_str.endswith(suffix):
            # float() tolerates the space left between number and unit.
            return int(float(size_str[:-len(suffix)]) * multiplier)

    # No multiplier matched, so the value is already in bytes.
    if size_str.endswith('byte'):
        size_str = size_str[:-4]
    elif size_str.endswith('b'):
        size_str = size_str[:-1]
    return int(size_str)
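One alternative way to handle the whitespace, plural, and float cases listed above is to split the number from the unit with a regular expression instead of testing suffixes. This is a minimal sketch, not the answer's method: `parse_size`, `PREFIXES`, `MULTIPLIERS`, and `SIZE_RE` are hypothetical names introduced here, and it assumes the same 1024-based multipliers.

```python
import re

# Hypothetical lookup table: unit name -> number of bytes (1024-based).
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]
MULTIPLIERS = {"": 1, "b": 1, "byte": 1}
for power, prefix in enumerate(PREFIXES, start=1):
    MULTIPLIERS[prefix + "byte"] = 1024 ** power  # e.g. "kilobyte"
    MULTIPLIERS[prefix[0] + "b"] = 1024 ** power  # e.g. "kb"

# A number (optionally with a fractional part), optional spaces, then letters.
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*([a-z]*)")


def parse_size(text):
    """Return the number of bytes described by text, e.g. "5.2 mb"."""
    match = SIZE_RE.fullmatch(text.strip().lower())
    if match is None:
        raise ValueError("not a size: %r" % text)
    number, unit = match.groups()
    if unit not in MULTIPLIERS and unit.endswith("s"):
        unit = unit[:-1]  # treat a trailing "s" as a plural marker
    if unit not in MULTIPLIERS:
        raise ValueError("unknown unit in %r" % text)
    # The value passes through float, so fractional bytes are truncated.
    return int(float(number) * MULTIPLIERS[unit])


print(parse_size("32MB"))   # 33554432
print(parse_size("100kB"))  # 102400
print(parse_size("123"))    # 123
```

Unlike the suffix-based version, this sketch raises `ValueError` for an unrecognized unit rather than failing inside `int()`; whether that is preferable depends on how the caller wants to report bad input.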

I wrote a series of tests for the values we're scraping:

from unittest import TestCase

# Assumes convert_size_to_bytes (defined above) is in scope in this module.


class TestFilesizeConversions(TestCase):

    def test_filesize_conversions(self):
        """Can we convert human filesizes to bytes?"""
        qa_pairs = [
            ('58 kb', 59392),
            ('117 kb', 119808),
            ('117kb', 119808),
            ('1 byte', 1),
            ('117 bytes', 117),
            ('117  bytes', 117),
            ('  117 bytes  ', 117),
            ('117b', 117),
            ('117bytes', 117),
            ('1 kilobyte', 1024),
            ('117 kilobytes', 119808),
            ('0.7 mb', 734003),
            ('1mb', 1048576),
            ('5.2 mb', 5452595),
        ]
        for qa in qa_pairs:
            print("Converting '%s' to bytes..." % qa[0], end='')
            self.assertEqual(convert_size_to_bytes(qa[0]), qa[1])
            print('✓')