是否有一种算法可以快速将大量十六进制字符串转换为字节流?汇编/C/C++

dAE*_*527 1 c c++ optimization assembly hex

这是我当前的代码:

//Input:hex string , 1234ABCDEEFF0505DDCC ....
//Output?BYTE stream
void HexString2Hex(/*IN*/ char* hexstring, /*OUT*/  BYTE* hexBuff)
{
    for (int i = 0; i < strlen(hexstring); i += 2)
    {
        BYTE val = 0;
        if (hexstring[i] < 'A')
            val += 0x10 * (hexstring[i] - '0');
        else
            val += 0xA0 + 0x10 * (hexstring[i] - 'A');

        if (hexstring[i+1] < 'A')
            val += hexstring[i + 1] - '0';
        else
            val += 0xA + hexstring[i + 1] - 'A';

        hexBuff[i / 2] = val;
    }
}
Run Code Online (Sandbox Code Playgroud)

问题是:当输入的十六进制字符串非常大(比如长度为 1000000)时,这个函数需要一百秒,这对我来说是不可接受的。(CPU:i7-8700,3.2GHz。内存:32G)

那么,是否有任何替代算法可以更快地完成工作?

谢谢你们

编辑 1:感谢稻田的评论。我太粗心了,没有注意到 strlen(time:O(n)) 被执行了数百次。所以我原来的函数是 O(n*n) 这太可怕了。

更新后的代码如下:

int len=strlen(hexstring);
for (int i = 0; i < len; i += 2)
Run Code Online (Sandbox Code Playgroud)

而且,对于 Emanuel P 的建议,我尝试过,但似乎不太好。下面是我的代码

map<string, BYTE> by_map;
Run Code Online (Sandbox Code Playgroud)
//init table (map here)
char *xx1 = "0123456789ABCDEF";
    for (int i = 0; i < 16;i++)
    {
        for (int j = 0; j < 16; j++)
        {
            
            _tmp[0] = xx1[i];
            _tmp[1] = xx1[j];

            BYTE val = 0;
            if (xx1[i] < 'A')
                val += 0x10 * (xx1[i] - '0');
            else
                val += 0xA0 + 0x10 * (xx1[i] - 'A');

            if (xx1[j] < 'A')
                val += xx1[j] - '0';
            else
                val += 0xA + xx1[j] - 'A';

            by_map.insert(map<string, BYTE>::value_type(_tmp, val));
        }
    }
Run Code Online (Sandbox Code Playgroud)
//search map
void HexString2Hex2(char* hexstring, BYTE* hexBuff)
{
    char _tmp[3] = { 0 };
    for (int i = 0; i < strlen(hexstring); i += 2)
    {
        _tmp[0] = hexstring[i];
        _tmp[1] = hexstring[i + 1];
        //DWORD dw = 0;
        //sscanf(_tmp, "%02X", &dw);
        hexBuff[i / 2] = by_map[_tmp];
    }
}
Run Code Online (Sandbox Code Playgroud)

Edit2:事实上,当我修复 strlen 错误时,我的问题就解决了。下面是我的最终代码:

void HexString2Bytes(/*IN*/ char* hexstr, /*OUT*/  BYTE* dst)
{
    static uint_fast8_t LOOKUP[256];
    for (int i = 0; i < 10; i++)
    {
        LOOKUP['0' + i] = i;
    }
    for (int i = 0; i < 6; i++)
    {
        LOOKUP['A' + i] = 0xA + i;
    }

    for (size_t i = 0; hexstr[i] != '\0'; i += 2)
    {
        *dst = LOOKUP[hexstr[i]] << 4 |
            LOOKUP[hexstr[i + 1]];
        dst++;
    }
}
Run Code Online (Sandbox Code Playgroud)

顺便说一句,真诚地感谢你们。你太棒了!真正的研究人员!

Lun*_*din 5

创建最高效代码的标准方法(以 RAM/ROM 为代价)是使用查找表。像这样的东西:

static const uint_fast8_t LOOKUP [256] =
{
  ['0'] = 0x0, ['1'] = 0x1, ['2'] = 0x2, ['3'] = 0x3,
  ['4'] = 0x4, ['5'] = 0x5, ['6'] = 0x6, ['7'] = 0x7,
  ['8'] = 0x8, ['9'] = 0x9, ['A'] = 0xA, ['B'] = 0xB,
  ['C'] = 0xC, ['D'] = 0xD, ['E'] = 0xE, ['F'] = 0xF,
};
Run Code Online (Sandbox Code Playgroud)

这牺牲了 256 字节的只读内存,因此我们不必进行任何形式的算术运算。如果uint_fast8_t编译器认为这有助于提高性能,则它可以让编译器选择更大的类型。

完整的代码将是这样的:

void hexstr_to_bytes (const char* restrict hexstr, uint8_t* restrict dst)
{
  static const uint_fast8_t LOOKUP [256] =
  {
    ['0'] = 0x0, ['1'] = 0x1, ['2'] = 0x2, ['3'] = 0x3,
    ['4'] = 0x4, ['5'] = 0x5, ['6'] = 0x6, ['7'] = 0x7,
    ['8'] = 0x8, ['9'] = 0x9, ['A'] = 0xA, ['B'] = 0xB,
    ['C'] = 0xC, ['D'] = 0xD, ['E'] = 0xE, ['F'] = 0xF,
  };
  
  for(size_t i=0; hexstr[i]!='\0'; i+=2)
  {
    *dst = LOOKUP[ hexstr[i  ] ] << 4 |
           LOOKUP[ hexstr[i+1] ];
    dst++;
  }
}
Run Code Online (Sandbox Code Playgroud)

在 x86_64 ( Godbolt )上测试时,这可以归结为大约 10 条指令。除循环条件外,无分支。值得注意的是,没有任何错误检查,因此您必须确保其他地方的数据正常(并且包含偶数个半字节)。

测试代码:

#include <stdio.h>
#include <stdint.h>

void hexstr_to_bytes (const char* restrict hexstr, uint8_t* restrict dst)
{
  static const uint_fast8_t LOOKUP [256] =
  {
    ['0'] = 0x0, ['1'] = 0x1, ['2'] = 0x2, ['3'] = 0x3,
    ['4'] = 0x4, ['5'] = 0x5, ['6'] = 0x6, ['7'] = 0x7,
    ['8'] = 0x8, ['9'] = 0x9, ['A'] = 0xA, ['B'] = 0xB,
    ['C'] = 0xC, ['D'] = 0xD, ['E'] = 0xE, ['F'] = 0xF,
  };
  
  for(size_t i=0; hexstr[i]!='\0'; i+=2)
  {
    *dst = LOOKUP[ hexstr[i  ] ] << 4 |
           LOOKUP[ hexstr[i+1] ];
    dst++;
  }
}

int main (void)
{
  const char hexstr[] = "DEADBEEFC0FFEE";
  uint8_t bytes [(sizeof hexstr - 1)/2];
  hexstr_to_bytes(hexstr, bytes);
  
  for(size_t i=0; i<sizeof bytes; i++)
  {
    printf("%.2X ", bytes[i]);
  }
}
Run Code Online (Sandbox Code Playgroud)

  • 另一种选择是使用两个数组;其值已经发生变化的一个。这将在不使用 64k mem 的情况下节省循环中的移位。 (2认同)