Tag: quantization

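Training a quantized model in TensorFlow

I want to train a quantized network, i.e. use quantized weights during the forward pass to compute the loss, and then update the underlying full-precision floating-point weights during the backward pass.

This question has already been asked here, but it was not answered.

Note that in my case "fake quantization" is sufficient. That is, the weights can still be stored as 32-bit floating-point values, as long as they represent low-bit-width quantized values.

In a blog post, Pete Warden states:

"[...] we do support "fake quantization" operators. If you include these in your graphs at the points where quantization is expected to occur (for example after convolutions), then in the forward pass the float values will be rounded to the specified number of levels (typically 256) to simulate the effects of quantization."

The operators he mentions can be found in the TensorFlow API.

Can anyone show me how to use these functions? If I call them after, say, a conv layer in the model definition, why would that quantize the weights in the layer rather than the layer's output (activations)?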
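To make the idea concrete, here is a minimal framework-free sketch of fake quantization with a straight-through gradient. It is not the TensorFlow operator itself: the function names, the fixed range [-1, 1], the 4 levels, and the toy data (x = 1.0, y = 0.3) are all assumptions chosen so the rounding is easy to see.

```python
def fake_quantize(w, w_min=-1.0, w_max=1.0, levels=256):
    """Snap w to the nearest of `levels` evenly spaced values in [w_min, w_max].

    The result is still an ordinary float, which is what "fake" refers to.
    """
    step = (w_max - w_min) / (levels - 1)
    clipped = min(max(w, w_min), w_max)
    return w_min + round((clipped - w_min) / step) * step


def train_step(w, x, y, lr=0.1):
    """One SGD step on loss = (q * x - y)^2 with q = fake_quantize(w)."""
    q = fake_quantize(w, levels=4)      # forward pass sees the quantized weight
    loss = (q * x - y) ** 2
    grad_q = 2.0 * (q * x - y) * x      # dLoss/dq
    grad_w = grad_q                     # straight-through assumption: dq/dw = 1
    return w - lr * grad_w, loss        # the full-precision weight is updated


w = 0.9                                 # hypothetical initial float weight
for _ in range(20):
    w, loss = train_step(w, x=1.0, y=0.3)
print(w, fake_quantize(w, levels=4))
```

The key point of the sketch is that rounding is applied to whatever value is passed in. Passing the weight variable quantizes the weights, while passing a layer's output would quantize the activations instead.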

quantization deep-learning tensorflow

ste*_*lin

6
votes

0
answers

766
views

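Explanation of vector quantization in speech processing

I can't work out from this research paper exactly how to reproduce the standard vector quantization algorithm so that it determines the language of an unidentified speech input from a training data set. Here is some basic information:

Abstract: Language identification (e.g. Japanese, English, German, etc.) using acoustic features is an important but difficult problem for current speech technology. ... The speech database used in this paper contains 20 languages: 16 sentences uttered twice by 4 male and 4 female speakers. Each sentence lasts about 8 seconds. The first algorithm is based on the standard vector quantization (VQ) technique. Every language is characterized by its own VQ codebook, [alt text].

Recognition algorithm: The first algorithm is based on the standard vector quantization (VQ) technique. Every language k is characterized by its own VQ codebook, [alt text]. In the recognition stage, the input speech is quantized by [alt text] and the accumulated quantization distortion d_k is calculated. The language with the minimum distortion is recognized. To calculate the VQ distortion, several LPC spectral distortion measures were applied ... in this case, the WLR (weighted likelihood ratio) distance: http://tinyurl.com/yc52gcl.

Standard VQ algorithm: A codebook for each language, [alt text: http://tinyurl.com/y8csx6e], is generated using training sentences. The accumulated distance of the input vectors over a sentence, [alt text], is defined as: [alt text: http://tinyurl.com/ybynjc2]

The distance d can be any distance that corresponds to the acoustic features, and it must be the same distance that was used for codebook generation. Every language is characterized by its VQ codebook, [alt text].

My question is: how exactly do I do this? I have a set of 50 English sentences. In MATLAB I can easily compute the WLR for any given signal. But how do I build a codebook, given that I must use WLR for the "codebook generation" for English? I'm also curious how to compare a VQ codebook of size 16 (found to be the optimal size) against a given input signal. If anyone could help me distill this paper, I would really appreciate it.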
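One common way to read "codebook generation" is a k-means style (generalized Lloyd) iteration, sketched below in Python rather than MATLAB. Everything here is an assumption for illustration: squared Euclidean distance stands in for WLR, the cell mean is used as the centroid (which is only exact for squared Euclidean distance), the accumulated distortion is taken to be the sum of nearest-code-vector distances, and the two-dimensional "feature vectors" are made up.

```python
import random


def sq_euclidean(a, b):
    """Placeholder distance; swap in a WLR implementation here."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, size, distance=sq_euclidean, iters=20, seed=0):
    """Generalized Lloyd iteration: assign to nearest code vector, then re-center."""
    rng = random.Random(seed)
    codebook = rng.sample(vectors, size)
    for _ in range(iters):
        cells = [[] for _ in range(size)]
        for v in vectors:
            nearest = min(range(size), key=lambda i: distance(v, codebook[i]))
            cells[nearest].append(v)
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous code vector
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def accumulated_distortion(vectors, codebook, distance=sq_euclidean):
    """Sum over input vectors of the distance to the closest code vector."""
    return sum(min(distance(v, c) for c in codebook) for v in vectors)


def identify(vectors, codebooks, distance=sq_euclidean):
    """Pick the language whose codebook gives the smallest accumulated distortion."""
    return min(codebooks,
               key=lambda lang: accumulated_distortion(vectors, codebooks[lang], distance))


# Made-up training features for two hypothetical languages.
english = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.0, 0.0]]
german = [[5.0, 5.1], [5.2, 4.9], [4.9, 5.0], [5.1, 5.2]]
codebooks = {"english": train_codebook(english, 2),
             "german": train_codebook(german, 2)}
print(identify([[0.1, 0.1], [0.05, 0.15]], codebooks))
```

With real data, `size` would be 16 and each vector would be the per-frame acoustic feature that the WLR distance operates on. The comparison against an input signal is then just the accumulated distortion computed per language.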

Thanks!

vector speech quantization audio-processing

atp*_*atp

2017 02-08

5
votes

1
answers

2574
views

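I'm using the NeuQuant quantization algorithm (https://code.google.com/p/android-gif-project/source/browse/trunk/GIFproject1/src/com/ui/NeuQuant.java?r=5) to reduce a JPEG to a 256-color image, but it is very slow (about 1 second for a 320x240 image, about 3 seconds for 640x480). Even with multiple threads I can't get the processing time down to an acceptable level (ideally in the range of 100 ms per image).

Does anyone know of a faster algorithm for reducing an image's palette to 256 colors?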
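For comparison, the cheapest possible approach is a fixed 3-3-2 palette: 3 bits of red, 3 of green, 2 of blue, with no analysis of the image at all. The sketch below is in Python for brevity (the bit operations port directly to Java) and the function names are made up. It trades quality for speed, since the palette is not adapted to the image the way NeuQuant's is; median cut and octree quantization are adaptive alternatives that sit between the two.

```python
def rgb_to_332(r, g, b):
    """Map 8-bit RGB to a palette index 0..255 (3 bits R, 3 bits G, 2 bits B)."""
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)


def palette_332():
    """Build the 256 palette entries, each at the center of its color bucket."""
    return [(((i >> 5) & 7) * 32 + 16,
             ((i >> 2) & 7) * 32 + 16,
             (i & 3) * 64 + 32)
            for i in range(256)]


def quantize(pixels):
    """Convert an iterable of (r, g, b) tuples to palette indices."""
    return [rgb_to_332(r, g, b) for r, g, b in pixels]


palette = palette_332()
indices = quantize([(200, 100, 50), (0, 0, 0), (255, 255, 255)])
print(indices, [palette[i] for i in indices])
```

Each pixel costs three shifts and two ORs, so the run time is linear in the pixel count with a very small constant.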

algorithm performance android quantization color-palette

Cat*_*Cat

5
votes

1
answers

2997
views

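Custom quantization tables for JPEG compression in Java

As the title says, I'm trying to compress an image to JPEG using custom quantization tables. My problem is that the resulting file can't be opened. The error is: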

Quantization table 0x00 was not defined

This is what my code looks like:

        JPEGImageWriteParam params = new JPEGImageWriteParam(null);
        if (mQMatrix != null) {
            JPEGHuffmanTable[] huffmanDcTables = {JPEGHuffmanTable.StdDCLuminance, JPEGHuffmanTable.StdDCChrominance};
            JPEGHuffmanTable[] huffmanAcTables = {JPEGHuffmanTable.StdACLuminance, JPEGHuffmanTable.StdACChrominance};
            dumpMatrices(mQMatrix);
            params.setEncodeTables(mQMatrix, huffmanDcTables, huffmanAcTables);
        }

        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();

        Iterator writers = ImageIO.getImageWritersByFormatName("JPEG");
        ImageWriter imageWriter = (ImageWriter) writers.next();

        ImageOutputStream imageOutputStream = ImageIO.createImageOutputStream(outputStream);
        imageWriter.setOutput(imageOutputStream);
        imageWriter.write(null, new IIOImage(mSourceImage, null, null), params);

        mCompressedImageSize = outputStream.size();

        try (FileOutputStream fileOutputStream = new FileOutputStream(mOutFileName)) {
            fileOutputStream.write(outputStream.toByteArray());

        }
        mCompressedImage = ImageIO.read(new ByteArrayInputStream(outputStream.toByteArray()));

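My guess is that it has something to do with the metadata, but I haven't found a solution.

Thanks, R.

Update: using a hex viewer, I determined that the quantization table (DQT - …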

java jpeg quantization

rho*_*ncu

2014 07-30

5
votes

1
answers

2710
views

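How to convert a float to fixed point with a predefined number of bits in Python

I have float32 numbers in numpy format (assume they are positive). I want to convert them to fixed-point numbers with a predefined number of bits in order to reduce their precision.

For example, in MATLAB the function num2fixpt turns the number 3.1415926 into 3.25. The command is num2fixpt(3.1415926,sfix(5),2^(1 + 2-5), 'Nearest','on'), which means 3 bits for the integer part and 2 bits for the fractional part.

Can I do the same thing in Python?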
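A minimal sketch of the arithmetic in plain Python: scale by 2^frac_bits, round to the nearest integer code, saturate to the word's range, and scale back. The function name and parameters are made up, and it assumes the trailing 'on' in the MATLAB call requests saturation.

```python
import math


def to_fixed_point(x, total_bits, frac_bits, signed=True):
    """Round x to the nearest multiple of 2**-frac_bits and saturate to the word range."""
    scale = 1 << frac_bits
    if signed:
        lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    else:
        lo, hi = 0, (1 << total_bits) - 1
    code = math.floor(x * scale + 0.5)   # nearest integer code, ties toward +infinity
    code = max(lo, min(hi, code))        # saturate instead of wrapping around
    return code / scale


print(to_fixed_point(3.1415926, total_bits=5, frac_bits=2))  # 3.25
```

For a whole numpy array the same steps can be written with `np.floor` and `np.clip` in place of `math.floor` and `max`/`min`.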

python fixed-point quantization

tum*_*990

5
votes

1
answers

6812
views

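Inverted multi-index

I'm trying to understand the inverted multi-index from this paper; there is also a shorter version here. To that end I built a toy example and would like someone to verify it and/or share their opinion with me.

The example: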

Assume we have N = 6 points in M = 4 dimensions. We use two blocks to
create sub-vectors. Let the points be these:

p0 = (0, 0, 1, 1)
p1 = (2, 2, 3, 3)
p2 = (4, 4, 5, 5)
p3 = (6, 6, 7, 7)
p4 = (8, 8, 9, 9)
p5 = (10, 10, 11, 11) // p5^1 = (10, 10), which is appended in D^1 etc.

_________________________________________________________________________

We run …
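
The splitting step of the toy example can be sketched as follows. This only covers the part that is visible above (building D^1 and D^2 from the six points); the helper name is made up and equal-width blocks are assumed.

```python
def split_into_blocks(points, num_blocks=2):
    """Cut every point into num_blocks equal-width sub-vectors, one list per block."""
    width = len(points[0]) // num_blocks
    return [[p[b * width:(b + 1) * width] for p in points]
            for b in range(num_blocks)]


points = [(0, 0, 1, 1), (2, 2, 3, 3), (4, 4, 5, 5),
          (6, 6, 7, 7), (8, 8, 9, 9), (10, 10, 11, 11)]
d1, d2 = split_into_blocks(points)
print(d1[5], d2[5])  # (10, 10) (11, 11)
```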

algorithm search quantization nearest-neighbor inverted-index

gsa*_*ras

2016 07-22

5
votes

1
answers

929
views

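Evaluating a Keras model with post-training quantized weights

I have a model trained in Keras and saved as an .h5 file. The model was trained with single-precision floating-point values on the TensorFlow backend. Now I want to implement a hardware accelerator that performs the convolution operations on a Xilinx FPGA. However, before I decide on the fixed-point bit width to use on the FPGA, I need to evaluate the model's accuracy with the weights quantized to 8-bit or 16-bit numbers. I came across tensorflow quantise, but I'm not sure how to get the weights from each layer, quantize them, and store them in a list of numpy arrays. Once all layers are quantized, I want to set the model's weights to the newly formed quantized weights. Can someone help me do this?

Here is what I have tried so far to reduce the precision from float32 to float16. Please let me know whether this is the right approach.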

import numpy as np

w_fp_16_test = []  # float16 copies of the weight arrays, in the same order as w_orginal
for i, w in enumerate(w_orginal):
    print('Shape of index: ' + str(i) + ' array is :')
    print(w.shape)
    # astype converts every element at once and keeps the shape,
    # so no flatten / per-element loop / reshape is needed
    w_fp_16_test.append(w.astype(np.float16))

python fpga quantization keras tensorflow

fri*_*989

5
votes

1
answers

1115
views

What does 'quantization' mean in interpreter.get_input_details()?

Using tflite and getting the interpreter's properties, for example:

print(interpreter.get_input_details())

[{'name': 'input_1_1', 'index': 47, 'shape': array([  1, 128, 128,   3], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.003921568859368563, 0)}]

What does 'quantization': (0.003921568859368563, 0) mean?
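
For context, the tuple is (scale, zero_point) in TensorFlow Lite's affine quantization scheme, where real_value = scale * (quantized_value - zero_point). The scale above is approximately 1/255, so a uint8 input of 0..255 represents roughly 0.0..1.0. A small sketch of the mapping, with made-up helper names:

```python
def dequantize(q, scale, zero_point):
    """Recover the real value that the integer q stands for."""
    return scale * (q - zero_point)


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a real value to the nearest representable uint8 code, clamped to range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


scale, zero_point = 0.003921568859368563, 0
print(dequantize(0, scale, zero_point), dequantize(255, scale, zero_point))
print(quantize(1.0, scale, zero_point))
```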

python quantization tensorflow tensorflow-lite

mrg*_*oom

5
votes

1
answers

704
views

TensorFlow fake-quantize layers are also called from TF-Lite

I'm using TensorFlow 2.1 to train a model with quantization-aware training.

The code to do that is:

import tensorflow_model_optimization as tfmot
model = tfmot.quantization.keras.quantize_annotate_model(model)

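This adds fake-quantize nodes to the graph. These nodes should adjust the model's weights so that they are easier to quantize to int8 and to use with int8 data.

When training ends, I convert and quantize the model to TF-Lite like so: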

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = [give data provider]
quantized_tflite_model = converter.convert()

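At this point I would not expect to see the fake-quantize layers in the TF-Lite graph. But surprisingly, I do see them. Moreover, when I run this quantized model in the TF-Lite C++ sample app, I see that it also runs the fake-quantize nodes during inference. On top of that, it dequantizes and quantizes the activations between each layer.

Here is a sample of the output from the C++ code: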

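Node 0 Operator Builtin Code 80 FAKE_QUANT
Inputs: 1
Outputs: 237
Node 1 Operator Builtin Code 114 QUANTIZE
Inputs: 237
Outputs: 238
Node 2 Operator Builtin Code 3 CONV_2D
Inputs: 238 59 58
Outputs: 167
Temporaries: 237
Node 3 Operator Builtin Code 6 DEQUANTIZE
Inputs: 167
Outputs: 239
Node 4 Operator Builtin Code 80 FAKE_QUANT
Inputs: 239
Outputs: 166 …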

quantization tensorflow tensorflow-lite quantization-aware-training

Oha*_*eir

5
votes

1
answers

946
views

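"No engine found for quantized operation" error when doing dynamic quantization with Huggingface transformers

I'm trying to apply dynamic quantization (quantizing the weights and activations) to a pretrained PyTorch model from the Huggingface library. I have consulted this link and found dynamic quantization to be the most suitable. I will use the quantized model on a CPU.

Link to the Huggingface model here.

Torch version: 1.6.0 (installed via pip)

Pretrained model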

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext")
model = AutoModel.from_pretrained("microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext")

Dynamic quantization

quantized_model = torch.quantization.quantize_dynamic(
    model, qconfig_spec={torch.nn.Linear}, dtype=torch.qint8
)

print(quantized_model)

Error

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-7-df2355c17e0b> in <module>
      1 quantized_model = torch.quantization.quantize_dynamic(
----> 2     model, qconfig_spec={torch.nn.Linear}, dtype=torch.qint8
      3 )
      4 
      5 print(quantized_model)

~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in quantize_dynamic(model, qconfig_spec, dtype, mapping, inplace)
    283     model.eval()
    284     propagate_qconfig_(model, qconfig_spec)
--> 285     convert(model, mapping, inplace=True)
    286     _remove_qconfig(model)
    287     return model

~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, …

quantization deep-learning pytorch

joe*_*oel

2022 06-27

5
votes

1
answers

1546
views