JPEG compression differences between C# and Python

Question

J. *_*Mac 6 c# python compression jpeg image

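I am migrating some image-processing functionality from .NET to Python, with the constraint that the output images must be compressed exactly as they were in .NET. However, when I compare the .jpg output files in a tool such as text-compare with the Ignore nothing option selected, the files show significant differences in how they were compressed.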

For example:

Python

import PIL.Image

bmp = PIL.Image.open('marbles.bmp')

bmp.save(
    'output_python.jpg',
    format='jpeg',
    dpi=(300,300),
    subsampling=2,  # 2 selects 4:2:0 chroma subsampling
    quality=75
)

.NET

using System.Drawing;
using System.Drawing.Imaging;
using System.Linq;

// Look up the built-in JPEG encoder and request quality 75.
ImageCodecInfo jpgEncoder = ImageCodecInfo.GetImageEncoders().First(codec => codec.FormatID == ImageFormat.Jpeg.Guid);
EncoderParameters myEncoderParameters = new EncoderParameters(1);
myEncoderParameters.Param[0] = new EncoderParameter(Encoder.Quality, 75L);

Bitmap bmp = new Bitmap(directory + "marbles.bmp");

bmp.Save(directory + "output_net.jpg", jpgEncoder, myEncoderParameters);

exiftool output_python.jpg -a -G1 -w txt

[ExifTool]      ExifTool Version Number         : 12.31
[System]        File Name                       : output_python.jpg
[System]        Directory                       : .
[System]        File Size                       : 148 KiB
[System]        File Modification Date/Time     : 2021:09:28 09:19:20-06:00
[System]        File Access Date/Time           : 2021:09:28 09:19:21-06:00
[System]        File Creation Date/Time         : 2021:09:27 21:33:35-06:00
[System]        File Permissions                : -rw-rw-rw-
[File]          File Type                       : JPEG
[File]          File Type Extension             : jpg
[File]          MIME Type                       : image/jpeg
[File]          Image Width                     : 1419
[File]          Image Height                    : 1001
[File]          Encoding Process                : Baseline DCT, Huffman coding
[File]          Bits Per Sample                 : 8
[File]          Color Components                : 3
[File]          Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
[JFIF]          JFIF Version                    : 1.01
[JFIF]          Resolution Unit                 : inches
[JFIF]          X Resolution                    : 300
[JFIF]          Y Resolution                    : 300
[Composite]     Image Size                      : 1419x1001
[Composite]     Megapixels                      : 1.4

exiftool output_net.jpg -a -G1 -w txt

[ExifTool]      ExifTool Version Number         : 12.31
[System]        File Name                       : output_net.jpg
[System]        Directory                       : .
[System]        File Size                       : 147 KiB
[System]        File Modification Date/Time     : 2021:09:28 09:18:05-06:00
[System]        File Access Date/Time           : 2021:09:28 09:18:52-06:00
[System]        File Creation Date/Time         : 2021:09:27 21:32:19-06:00
[System]        File Permissions                : -rw-rw-rw-
[File]          File Type                       : JPEG
[File]          File Type Extension             : jpg
[File]          MIME Type                       : image/jpeg
[File]          Image Width                     : 1419
[File]          Image Height                    : 1001
[File]          Encoding Process                : Baseline DCT, Huffman coding
[File]          Bits Per Sample                 : 8
[File]          Color Components                : 3
[File]          Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
[JFIF]          JFIF Version                    : 1.01
[JFIF]          Resolution Unit                 : inches
[JFIF]          X Resolution                    : 300
[JFIF]          Y Resolution                    : 300
[Composite]     Image Size                      : 1419x1001
[Composite]     Megapixels                      : 1.4

marbles.bmp sample image

Differences shown by text-compare

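Questions

Is it reasonable to assume that these two JPEG compression implementations can produce identical output files?
If so, does either PIL or System.Drawing.Image perform additional steps (such as anti-aliasing) that make the results differ?
Or are there additional parameters to PIL .save() that would make it behave more like the JPEG encoder in C#?

Thanks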

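Update

Following Jeremy's suggestion, I used JPEGsnoop to compare the files in more detail and found that the luminance and chrominance tables were different. I modified the code: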

import PIL.Image

bmp = PIL.Image.open('marbles.bmp')

# Reuse the quantization tables read from the .NET output.
output_net = PIL.Image.open('output_net.jpg')

bmp.save(
    'output_python.jpg',
    format='jpeg',
    dpi=(300,300),
    subsampling=2,
    qtables=output_net.quantization,
    #quality=75
)

The tables are now identical, but the differences between the files are unchanged. The only differences JPEGsnoop now shows are in Compression stats and Huffman code histogram stats.
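Once the quantization tables match, one way to narrow things down further is to check whether the two files decode to the same pixels. The sketch below is not part of the original post; `compare_jpegs` is a hypothetical helper that uses Pillow's `quantization` attribute and `ImageChops.difference`. If the tables match but the decoded pixels differ, the coefficients themselves differ, which would point to arithmetic differences between the encoders (color conversion, chroma downsampling, DCT rounding) rather than to the tables.

```python
from PIL import Image, ImageChops


def compare_jpegs(path_a, path_b):
    """Return (same_qtables, max_abs_pixel_diff) for two JPEG files.

    Both arguments may be file names or binary file objects.
    """
    with Image.open(path_a) as a, Image.open(path_b) as b:
        # Quantization tables as stored in each file's DQT segments.
        same_tables = a.quantization == b.quantization
        # Per-pixel absolute difference of the decoded images.
        diff = ImageChops.difference(a.convert("RGB"), b.convert("RGB"))
        # getextrema() yields one (min, max) pair per band.
        max_diff = max(high for _, high in diff.getextrema())
    return same_tables, max_diff
```

With the files from the question this would be called as `compare_jpegs('output_net.jpg', 'output_python.jpg')`. Both files must have the same dimensions, as they do here.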

output_net.jpg

*** Decoding SCAN Data ***
  OFFSET: 0x0000026F
  Scan Decode Mode: Full IDCT (AC + DC)

  Scan Data encountered marker   0xFFD9 @ 0x00024BE7.0

  Compression stats:
    Compression Ratio: 28.43:1
    Bits per pixel:     0.84:1

  Huffman code histogram stats:
    Huffman Table: (Dest ID: 0, Class: DC)
      # codes of length 01 bits:        0 (  0%)
      # codes of length 02 bits:     1664 (  7%)
      # codes of length 03 bits:    18238 ( 81%)
      # codes of length 04 bits:     1807 (  8%)
      # codes of length 05 bits:      715 (  3%)
      # codes of length 06 bits:        4 (  0%)
      # codes of length 07 bits:        0 (  0%)
      ...

output_python.jpg

*** Decoding SCAN Data ***
  OFFSET: 0x0000026F
  Scan Decode Mode: Full IDCT (AC + DC)

  Scan Data encountered marker   0xFFD9 @ 0x00025158.0

  Compression stats:
    Compression Ratio: 28.17:1
    Bits per pixel:     0.85:1

  Huffman code histogram stats:
    Huffman Table: (Dest ID: 0, Class: DC)
      # codes of length 01 bits:        0 (  0%)
      # codes of length 02 bits:     1659 (  7%)
      # codes of length 03 bits:    18247 ( 81%)
      # codes of length 04 bits:     1807 (  8%)
      # codes of length 05 bits:      711 (  3%)
      # codes of length 06 bits:        4 (  0%)
      # codes of length 07 bits:        0 (  0%)
      ...

I am now looking for a way to make these values match through PIL.

Answer 1

小智 2

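Is it reasonable to assume that these two JPEG compression implementations can produce identical output files?

In general, no.

The point of JPEG compression is high compression at the cost of loss. Even at a quality setting of 100, loss is unavoidable, because the algorithm would need infinite precision to reproduce the source image exactly.

Two implementations can produce identical files only if they encode in the same way with the same parameters: precision, boundary handling, and the padding/offset rules used to bring the image up to a whole number of blocks for the transform (JPEG uses an 8x8 DCT rather than an FFT).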
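As a worked example of the padding step, the sketch below computes the block grid for the 1419x1001 image from the question. It assumes 4:2:0 subsampling (as reported by exiftool above), which gives 16x16-pixel MCUs; `mcu_grid` is a hypothetical helper name. How an encoder fills the padded area is its own choice, so this is one place where two encoders can legitimately diverge.

```python
import math


def mcu_grid(width, height, mcu_w=16, mcu_h=16):
    """Return (padded_width, padded_height, mcu_count) for a baseline JPEG."""
    cols = math.ceil(width / mcu_w)
    rows = math.ceil(height / mcu_h)
    return cols * mcu_w, rows * mcu_h, cols * rows


padded_w, padded_h, mcus = mcu_grid(1419, 1001)
luma_blocks = 4 * mcus  # four 8x8 Y blocks per 16x16 MCU with 4:2:0
print(padded_w, padded_h, mcus, luma_blocks)  # 1424 1008 5607 22428
```

The 22428 luma blocks match the JPEGsnoop output in the question: the listed code counts for the DC table with Dest ID 0 sum to 22428 in both files, one DC code per luma block.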

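An implementation of the JPEG algorithm may run a pre-pass to optimize the algorithm's parameters.

Because this parameter optimization differs between the two implementations, the output is unlikely to be identical.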
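Pillow exposes one such pre-pass through the `optimize` option of `save()`, which makes an extra pass to pick Huffman tables for the specific image. The sketch below uses a synthetic test image (an assumption, not the marbles image from the question) to show that this changes the file bytes while leaving the decoded pixels untouched. Whether the .NET encoder performs a comparable pass is not something this example establishes.

```python
import io

from PIL import Image


def encode(img, **options):
    """Encode img as JPEG in memory with the settings used in the question."""
    buf = io.BytesIO()
    img.save(buf, format="jpeg", quality=75, subsampling=2, **options)
    return buf.getvalue()


def decoded_pixels(jpeg_bytes):
    """Decode JPEG bytes and return the raw RGB pixel data."""
    with Image.open(io.BytesIO(jpeg_bytes)) as im:
        return im.convert("RGB").tobytes()


# Synthetic 64x64 RGB pattern standing in for a real photo.
pattern = bytes((x * y + 37 * c) % 256
                for y in range(64) for x in range(64) for c in range(3))
img = Image.frombytes("RGB", (64, 64), pattern)

plain = encode(img)
optimized = encode(img, optimize=True)

print("same bytes: ", plain == optimized)
print("same pixels:", decoded_pixels(plain) == decoded_pixels(optimized))
```

The two encodings are expected to differ byte for byte and still decode identically, because only the entropy-coding tables change, not the quantized coefficients.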

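Are there additional parameters to PIL .save() that would make it behave more like the JPEG encoder in C#?

I cannot answer this directly. However, you can use the Python for .NET package to call the C# JPEG encoder from Python. That approach would give consistently identical results.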
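A minimal sketch of that approach, assuming Windows, the `pythonnet` package (which provides the `clr` module), and a runtime where the `System.Drawing` assembly can be loaded. The function name `save_jpeg_with_gdiplus` is hypothetical, and the code has not been checked against the files from the question.

```python
def save_jpeg_with_gdiplus(src_path, dst_path, quality=75):
    """Encode src_path as JPEG through System.Drawing, mirroring the C# code."""
    # Imported lazily so this module still loads where pythonnet is absent.
    import clr  # provided by the pythonnet package

    clr.AddReference("System.Drawing")
    from System import Int64
    from System.Drawing import Bitmap
    from System.Drawing.Imaging import (
        Encoder,
        EncoderParameter,
        EncoderParameters,
        ImageCodecInfo,
    )

    # Pick the built-in JPEG encoder by MIME type.
    jpeg_codec = next(
        codec for codec in ImageCodecInfo.GetImageEncoders()
        if codec.MimeType == "image/jpeg"
    )

    params = EncoderParameters(1)
    # Int64 selects the same overload as the 75L literal in the C# snippet.
    params.Param[0] = EncoderParameter(Encoder.Quality, Int64(quality))

    bmp = Bitmap(src_path)
    try:
        bmp.Save(dst_path, jpeg_codec, params)
    finally:
        bmp.Dispose()
```

It would be called as `save_jpeg_with_gdiplus('marbles.bmp', 'output_net_from_python.jpg')`. Since the same encoder does the work, the result should match the C# output, at the cost of tying the Python code to .NET.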

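Apart from its educational value, why would anyone need binary compatibility?

In every practical scenario I can think of where this matters, all that is needed is an additional hash of the saved image: store the new hash in a separate field.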
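The extra-hash idea could look roughly like this. The record layout and the field names `sha256_net` and `sha256_python` are illustrative assumptions, and the byte strings are placeholders for the real encoder outputs.

```python
import hashlib


def with_extra_hash(record, field, jpeg_bytes):
    """Return a copy of record with the SHA-256 of jpeg_bytes stored under field."""
    updated = dict(record)
    updated[field] = hashlib.sha256(jpeg_bytes).hexdigest()
    return updated


# Existing record that already carries the hash of the .NET-encoded file.
record = {
    "name": "marbles",
    "sha256_net": hashlib.sha256(b"bytes from the .NET encoder").hexdigest(),
}

# Keep the old hash and add the hash of the Python-encoded file next to it.
record = with_extra_hash(record, "sha256_python", b"bytes from the Python encoder")
print(sorted(record))
```

Lookups can then accept either hash, so files written by the old and the new encoder are both recognized.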

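Pick one technology and use it until it no longer meets your needs or requirements. When that happens (preferably before), find shims to fill the gaps and rewrite the code to take advantage of the new technology.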
