C++性能挑战:整数到std :: string的转换

Ben*_*igt 118 c++ string performance integer

任何人都可以击败我的整数到std :: string代码的性能,链接如下?

已经有几个问题可以解释如何将整数转换为std::stringC++中的整数,例如这个,但是所提供的解决方案都没有效率.

以下是一些可用于竞争的常用方法的编译就绪代码:

流行的看法相反,boost::lexical_cast有自己的实现(白皮书)并且不使用stringstream和数字插入运算符.我真的希望看到它的性能比较,因为另一个问题表明它很悲惨.

我自己的贡献,在桌面计算机上具有竞争力,并演示了一种在嵌入式系统上全速运行的方法,与依赖整数模数的算法不同:

如果您想使用该代码,我将根据简化的BSD许可证提供(允许商业使用,需要归属).请问.

最后,该功能ltoa是非标准的,但可广泛使用.

  • ltoa版本,适用于拥有提供它的编译器的任何人(ideone没有):http://ideone.com/T5Wim

我很快就会发布我的性能测量结果.

算法规则

  • 提供将至少32位有符号和无符号整数转换为十进制的代码.
  • 产生输出作为std::string.
  • 没有与线程和信号不兼容的技巧(例如,静态缓冲区).
  • 您可以假设ASCII字符集.
  • 确保INT_MIN在绝对值无法表示的二进制补码机上测试代码.
  • 理想情况下,输出应为字符的字符与规范的C++版本使用相同的stringstream,http://ideone.com/jh3Sa,但任何事情,这显然是理解的,因为正确的号码也是OK.
  • :虽然你可以使用你想要进行比较的任何编译器和优化器选项(除了完全禁用),但代码还需要编译并在至少VC++ 2010和g ++下给出正确的结果.

希望讨论

除了更好的算法,我还想在几个不同的平台和编译器上获得一些基准测试(让我们使用MB/s吞吐量作为我们的标准测量单位).我相信我的算法代码(我知道sprintf基准测试需要一些快捷方式 - 现在已经修复)是标准的明确定义的行为,至少在ASCII假设下,但是如果你看到任何未定义的行为或输出的输出无效,请指出.

结论:

不同的算法对g ++和VC2010执行,可能是由于std::string每个算法的实现不同.VC2010显然在NRVO方面做得更好,摆脱了价值回报只对gcc有帮助.

发现代码的性能优于sprintf一个数量级. ostringstream落后50倍甚至更多.

挑战的胜利者是user434507,他在gcc上生成的代码运行速度是我自己的350%.由于SO社区的突发奇想,其他条目将被关闭.

目前(最终?)速度冠军是:

  • 对于gcc:user434507,速度比sprintf:http://ideone.com/0uhhX快8倍
  • 对于Visual C++:蒂莫,以15比快倍sprintf:http://ideone.com/VpKO3

Eug*_*ith 33

#include <string>

const char digit_pairs[201] = {
  "00010203040506070809"
  "10111213141516171819"
  "20212223242526272829"
  "30313233343536373839"
  "40414243444546474849"
  "50515253545556575859"
  "60616263646566676869"
  "70717273747576777879"
  "80818283848586878889"
  "90919293949596979899"
};


std::string& itostr(int n, std::string& s)
{
    if(n==0)
    {
        s="0";
        return s;
    }

    int sign = -(n<0);
    unsigned int val = (n^sign)-sign;

    int size;
    if(val>=10000)
    {
        if(val>=10000000)
        {
            if(val>=1000000000)
                size=10;
            else if(val>=100000000)
                size=9;
            else 
                size=8;
        }
        else
        {
            if(val>=1000000)
                size=7;
            else if(val>=100000)
                size=6;
            else
                size=5;
        }
    }
    else 
    {
        if(val>=100)
        {
            if(val>=1000)
                size=4;
            else
                size=3;
        }
        else
        {
            if(val>=10)
                size=2;
            else
                size=1;
        }
    }
    size -= sign;
    s.resize(size);
    char* c = &s[0];
    if(sign)
        *c='-';

    c += size-1;
    while(val>=100)
    {
       int pos = val % 100;
       val /= 100;
       *(short*)(c-1)=*(short*)(digit_pairs+2*pos); 
       c-=2;
    }
    while(val>0)
    {
        *c--='0' + (val % 10);
        val /= 10;
    }
    return s;
}

std::string& itostr(unsigned val, std::string& s)
{
    if(val==0)
    {
        s="0";
        return s;
    }

    int size;
    if(val>=10000)
    {
        if(val>=10000000)
        {
            if(val>=1000000000)
                size=10;
            else if(val>=100000000)
                size=9;
            else 
                size=8;
        }
        else
        {
            if(val>=1000000)
                size=7;
            else if(val>=100000)
                size=6;
            else
                size=5;
        }
    }
    else 
    {
        if(val>=100)
        {
            if(val>=1000)
                size=4;
            else
                size=3;
        }
        else
        {
            if(val>=10)
                size=2;
            else
                size=1;
        }
    }

    s.resize(size);
    char* c = &s[size-1];
    while(val>=100)
    {
       int pos = val % 100;
       val /= 100;
       *(short*)(c-1)=*(short*)(digit_pairs+2*pos); 
       c-=2;
    }
    while(val>0)
    {
        *c--='0' + (val % 10);
        val /= 10;
    }
    return s;
}
Run Code Online (Sandbox Code Playgroud)

这将炸毁在禁止使用未对齐的内存访问(在这种情况下,通过第一未对准分配系统*(short*)会导致段错误),但应该很很好地工作,否则.

一件重要的事情是尽量减少使用std::string.(讽刺,我知道.)例如,在Visual Studio中,即使在编译器选项中指定/ Ob2,也不会内联对std :: string方法的大多数调用.因此,即使是一个像调用一样微不足道的东西std::string::clear(),你可能期望非常快,在将CRT作为静态库链接时可以获得100个时钟信号,而当作为DLL链接时可以多达300个时钟信号.

出于同样的原因,通过引用返回更好,因为它避免了赋值,构造函数和析构函数.

  • 作为记录,上面的答案是“user434507”算法。 (2认同)

Chr*_*man 21

啊,顺便说一句很棒的挑战......我玩得很开心.

我有两个提交的算法(如果你想跳过它,代码就在底部).在我的比较中,我要求函数返回一个字符串,并且它可以处理int和unsigned int.将不构造字符串的内容与那些不构造字符串的内容进行比较并不是很有意义.

第一个是一个有趣的实现,它不使用任何预先计算的查找表或显式除法/模数.这个与gcc的其他人以及msvc上除了Timo之外的所有人都很有竞争力(我在下面解释的原因很好).第二种算法是我实际提交的最高性能.在我的测试中,它击败了gcc和msvc上的所有其他人.

我想我知道为什么MSVC上的一些结果非常好.std :: string有两个相关的构造函数 std::string(char* str, size_t n)
,
std::string(ForwardIterator b, ForwardIterator e)
gcc为它们做同样的事情......就是它使用第二个来实现第一个.第一个构造函数可以比这更有效地实现,MSVC也是如此.这样做的另一个好处是,在某些情况下(比如我的快速代码和Timo代码),可以内联字符串构造函数.事实上,只需在MSVC中切换这些构造函数,我的代码就差不多是2倍.

我的性能测试结果:

代码来源:

- Voigt
- Timo
- ergosys
- user434507
- user-voigt-timo
- hopman-fun
- hopman-fast

gcc 4.4.5 -O2在Ubuntu 10.10 64位,Core i5上

hopman_fun: 124.688  MB/sec --- 8.020 s
hopman_fast: 137.552  MB/sec --- 7.270 s
voigt: 120.192  MB/sec --- 8.320 s
user_voigt_timo: 97.9432  MB/sec --- 10.210 s
timo: 120.482  MB/sec --- 8.300 s
user: 97.7517  MB/sec --- 10.230 s
ergosys: 101.42  MB/sec --- 9.860 s

Windows 7 64位,酷睿i5上的MSVC 2010 64位/ Ox

hopman_fun: 127  MB/sec --- 7.874 s
hopman_fast: 259  MB/sec --- 3.861 s
voigt: 221.435  MB/sec --- 4.516 s
user_voigt_timo: 195.695  MB/sec --- 5.110 s
timo: 253.165  MB/sec --- 3.950 s
user: 212.63  MB/sec --- 4.703 s
ergosys: 78.0518  MB/sec --- 12.812 s

以下是ideone上的一些结果和测试/时序框架
http://ideone.com/XZRqp
请注意,ideone是一个32位环境.我的两个算法都受此影响,但是hopman_fast至少仍然具有竞争力.

请注意,对于那些不构造字符串的人,我添加了以下函数模板:

template <typename T>
std::string itostr(T t) {
    std::string ret;
    itostr(t, ret);
    return ret;
}
Run Code Online (Sandbox Code Playgroud)

现在我的代码...首先是有趣的一个:

    // hopman_fun

template <typename T> 
T reduce2(T v) {
    T k = ((v * 410) >> 12) & 0x000F000F000F000Full;
    return (((v - k * 10) << 8) + k);
}

template <typename T>
T reduce4(T v) {
    T k = ((v * 10486) >> 20) & 0xFF000000FFull;
    return reduce2(((v - k * 100) << 16) + (k));
}

typedef unsigned long long ull;
inline ull reduce8(ull v) {
    ull k = ((v * 3518437209u) >> 45);
    return reduce4(((v - k * 10000) << 32) + (k));
}

template <typename T>
std::string itostr(T o) {
    union {
        char str[16];
        unsigned short u2[8];
        unsigned u4[4];
        unsigned long long u8[2];
    };

    unsigned v = o < 0 ? ~o + 1 : o;

    u8[0] = (ull(v) * 3518437209u) >> 45;
    u8[0] = (u8[0] * 28147497672ull);
    u8[1] = v - u2[3] * 100000000;

    u8[1] = reduce8(u8[1]);
    char* f;
    if (u2[3]) {
        u2[3] = reduce2(u2[3]);
        f = str + 6;
    } else {
        unsigned short* k = u4[2] ? u2 + 4 : u2 + 6;
        f = *k ? (char*)k : (char*)(k + 1);
    }
    if (!*f) f++;

    u4[1] |= 0x30303030;
    u4[2] |= 0x30303030;
    u4[3] |= 0x30303030;
    if (o < 0) *--f = '-';
    return std::string(f, (str + 16) - f);
}
Run Code Online (Sandbox Code Playgroud)

快速的一个:

    // hopman_fast

struct itostr_helper {
    static unsigned out[10000];

    itostr_helper() {
        for (int i = 0; i < 10000; i++) {
            unsigned v = i;
            char * o = (char*)(out + i);
            o[3] = v % 10 + '0';
            o[2] = (v % 100) / 10 + '0';
            o[1] = (v % 1000) / 100 + '0';
            o[0] = (v % 10000) / 1000;
            if (o[0]) o[0] |= 0x30;
            else if (o[1] != '0') o[0] |= 0x20;
            else if (o[2] != '0') o[0] |= 0x10;
            else o[0] |= 0x00;
        }
    }
};
unsigned itostr_helper::out[10000];

itostr_helper hlp_init;

template <typename T>
std::string itostr(T o) {
    typedef itostr_helper hlp;

    unsigned blocks[3], *b = blocks + 2;
    blocks[0] = o < 0 ? ~o + 1 : o;
    blocks[2] = blocks[0] % 10000; blocks[0] /= 10000;
    blocks[2] = hlp::out[blocks[2]];

    if (blocks[0]) {
        blocks[1] = blocks[0] % 10000; blocks[0] /= 10000;
        blocks[1] = hlp::out[blocks[1]];
        blocks[2] |= 0x30303030;
        b--;
    }

    if (blocks[0]) {
        blocks[0] = hlp::out[blocks[0] % 10000];
        blocks[1] |= 0x30303030;
        b--;
    }

    char* f = ((char*)b);
    f += 3 - (*f >> 4);

    char* str = (char*)blocks;
    if (o < 0) *--f = '-';
    return std::string(f, (str + 12) - f);
}
Run Code Online (Sandbox Code Playgroud)


Ben*_*igt 12

问题中提供的代码的基准数据:

关于ideone(gcc 4.3.4):

Core i7,Windows 7 64位,8 GB RAM,Visual C++ 2010 32位:

cl /Ox /EHsc

  • stringstreams:3.39 MB/s,3.67 MB/s
  • sprintf:16.8 MB/s,16.2 MB/s
  • 我的:194 MB/s,207 MB/s(启用PGO:250 MB/s)

Core i7,Windows 7 64位,8 GB RAM,Visual C++ 2010 64位:

cl /Ox /EHsc

  • stringstreams:4.42 MB/s,4.92 MB/s
  • sprintf:21.0 MB/s,20.8 MB/s
  • 我的:238 MB/s,228 MB/s

Core i7,Windows 7 64位,8 GB RAM,cygwin gcc 4.3.4:

g++ -O3

  • stringstreams:2.19 MB/s,2.17 MB/s
  • sprintf:13.1 MB/s,13.4 MB/s
  • 我的:30.0 MB/s,30.2 MB/s

编辑:我要添加自己的答案,但问题是关闭的,所以我在这里添加它.:)我编写了自己的算法,并设法对Ben的代码进行了不错的改进,虽然我只在MSVC 2010中进行了测试.我还对目前为止提供的所有实现进行了基准测试,使用了Ben的原始版本中的相同测试设置.码. - 蒂莫

英特尔Q9450,Win XP 32位,MSVC 2010

cl /O2 /EHsc

  • stringstream:2.87 MB/s
  • sprintf:16.1 MB/s
  • Ben:202 MB/s
  • Ben(无符号缓冲区):82.0 MB/s
  • ergosys(更新版本):64.2 MB/s
  • user434507:172 MB/s
  • Timo:241 MB/s

-

const char digit_pairs[201] = {
  "00010203040506070809"
  "10111213141516171819"
  "20212223242526272829"
  "30313233343536373839"
  "40414243444546474849"
  "50515253545556575859"
  "60616263646566676869"
  "70717273747576777879"
  "80818283848586878889"
  "90919293949596979899"
};

static const int BUFFER_SIZE = 11;

std::string itostr(int val)
{
  char buf[BUFFER_SIZE];
  char *it = &buf[BUFFER_SIZE-2];

  if(val>=0) {
    int div = val/100;
    while(div) {
      memcpy(it,&digit_pairs[2*(val-div*100)],2);
      val = div;
      it-=2;
      div = val/100;
    }
    memcpy(it,&digit_pairs[2*val],2);
    if(val<10)
      it++;
  } else {
    int div = val/100;
    while(div) {
      memcpy(it,&digit_pairs[-2*(val-div*100)],2);
      val = div;
      it-=2;
      div = val/100;
    }
    memcpy(it,&digit_pairs[-2*val],2);
    if(val<=-10)
      it--;
    *it = '-';
  }

  return std::string(it,&buf[BUFFER_SIZE]-it);
}

std::string itostr(unsigned int val)
{
  char buf[BUFFER_SIZE];
  char *it = (char*)&buf[BUFFER_SIZE-2];

  int div = val/100;
  while(div) {
    memcpy(it,&digit_pairs[2*(val-div*100)],2);
    val = div;
    it-=2;
    div = val/100;
  }
  memcpy(it,&digit_pairs[2*val],2);
  if(val<10)
    it++;

  return std::string((char*)it,(char*)&buf[BUFFER_SIZE]-(char*)it);
}
Run Code Online (Sandbox Code Playgroud)


Mar*_* Ba 10

虽然我们在这里得到的算法信息非常好,我认为问题是"破碎",我会解释为什么我认为这个:

问题要求采用int- > std::string转换的性能,这在比较常用方法时可能会很有用,例如不同的字符串流实现或boost :: lexical_cast.但是,在要求使用新代码(专用算法)时,它没有意义.原因是int2string总是涉及来自std :: string的堆分配,如果我们试图挤出转换算法的最后一个,我认为将这些测量与std完成的堆分配混合起来是不合理的: :串.如果我想要高性能转换,我将始终使用固定大小的缓冲区,当然永远不会在堆上分配任何东西!

总而言之,我认为时间应该分开:

  • 首先,最快(int - >固定缓冲区)转换.
  • 二,定时(固定缓冲区 - > std :: string)复制.
  • 第三,检查std :: string分配如何直接用作缓冲区,以保存复制.

恕我直言,这些方面不应该在一个时间混淆.

  • <quote> int2string将始终涉及来自std :: string的堆分配</ quote>不使用小字符串优化,这在标准库的大多数当前实现中都存在. (3认同)
  • @Ben - 好评.ESP.sm.str.opt.在判断std.string性能时,我将来必须记住这件事. (3认同)

erg*_*sys 6

我无法在VS下测试,但这似乎比你的g ++代码要快,大约10%.它可能被调整,选择的决策值是猜测.只有,抱歉.

typedef unsigned buf_t; 

static buf_t * reduce(unsigned val, buf_t * stp) {
   unsigned above = val / 10000; 
   if (above != 0) {
      stp = reduce(above, stp); 
      val -= above * 10000; 
   }

   buf_t digit  = val / 1000; 
   *stp++ = digit + '0'; 
   val -= digit * 1000; 

   digit  = val / 100; 
   *stp++ = digit + '0'; 
   val -= digit * 100; 

   digit  = val / 10; 
   *stp++ = digit + '0'; 
   val -= digit * 10; 
   *stp++ = val + '0'; 
   return stp; 
}

std::string itostr(int input) {

   buf_t buf[16]; 


   if(input == INT_MIN) {  
      char buf2[16]; 
      std::sprintf(buf2, "%d", input); 
      return std::string(buf2); 
   }

   // handle negative
   unsigned val = input;
   if(input < 0) 
      val = -input;

   buf[0] = '0'; 
   buf_t* endp = reduce(val, buf+1); 
   *endp = 127; 

   buf_t * stp = buf+1; 
   while (*stp == '0') 
      stp++;
   if (stp == endp)
      stp--; 

   if (input < 0) { 
      stp--; 
      *stp = '-'; 
   }
   return std::string(stp, endp); 
}
Run Code Online (Sandbox Code Playgroud)


小智 6

更新了我的答案... modp_ufast ...

Integer To String Test (Type 1)
[modp_ufast]Numbers: 240000000  Total:   657777786      Time:  1.1633sec        Rate:206308473.0686nums/sec
[sprintf] Numbers: 240000000    Total:   657777786      Time: 24.3629sec        Rate:  9851045.8556nums/sec
[karma]   Numbers: 240000000    Total:   657777786      Time:  5.2389sec        Rate: 45810870.7171nums/sec
[strtk]   Numbers: 240000000    Total:   657777786      Time:  3.3126sec        Rate: 72450283.7492nums/sec
[so   ]   Numbers: 240000000    Total:   657777786      Time:  3.0828sec        Rate: 77852152.8820nums/sec
[timo ]   Numbers: 240000000    Total:   657777786      Time:  4.7349sec        Rate: 50687912.9889nums/sec
[voigt]   Numbers: 240000000    Total:   657777786      Time:  5.1689sec        Rate: 46431985.1142nums/sec
[hopman]  Numbers: 240000000    Total:   657777786      Time:  4.6169sec        Rate: 51982554.6497nums/sec
Press any key to continue . . .

Integer To String Test(Type 2)
[modp_ufast]Numbers: 240000000  Total:   660000000      Time:  0.5072sec        Rate:473162716.4618nums/sec
[sprintf] Numbers: 240000000    Total:   660000000      Time: 22.3483sec        Rate: 10739062.9383nums/sec
[karma]   Numbers: 240000000    Total:   660000000      Time:  4.2471sec        Rate: 56509024.3035nums/sec
[strtk]   Numbers: 240000000    Total:   660000000      Time:  2.1683sec        Rate:110683636.7123nums/sec
[so   ]   Numbers: 240000000    Total:   660000000      Time:  2.7133sec        Rate: 88454602.1423nums/sec
[timo ]   Numbers: 240000000    Total:   660000000      Time:  2.8030sec        Rate: 85623453.3872nums/sec
[voigt]   Numbers: 240000000    Total:   660000000      Time:  3.4019sec        Rate: 70549286.7776nums/sec
[hopman]  Numbers: 240000000    Total:   660000000      Time:  2.7849sec        Rate: 86178023.8743nums/sec
Press any key to continue . . .

Integer To String Test (type 3)
[modp_ufast]Numbers: 240000000  Total:   505625000      Time:  1.6482sec        Rate:145610315.7819nums/sec
[sprintf] Numbers: 240000000    Total:   505625000      Time: 20.7064sec        Rate: 11590618.6109nums/sec
[karma]   Numbers: 240000000    Total:   505625000      Time:  4.3036sec        Rate: 55767734.3570nums/sec
[strtk]   Numbers: 240000000    Total:   505625000      Time:  2.9297sec        Rate: 81919227.9275nums/sec
[so   ]   Numbers: 240000000    Total:   505625000      Time:  3.0278sec        Rate: 79266003.8158nums/sec
[timo ]   Numbers: 240000000    Total:   505625000      Time:  4.0631sec        Rate: 59068204.3266nums/sec
[voigt]   Numbers: 240000000    Total:   505625000      Time:  4.5616sec        Rate: 52613393.0285nums/sec
[hopman]  Numbers: 240000000    Total:   505625000      Time:  4.1248sec        Rate: 58184194.4569nums/sec
Press any key to continue . . .

int ufast_utoa10(unsigned int value, char* str)
{
#define JOIN(N) N "0", N "1", N "2", N "3", N "4", N "5", N "6", N "7", N "8", N "9"
#define JOIN2(N) JOIN(N "0"), JOIN(N "1"), JOIN(N "2"), JOIN(N "3"), JOIN(N "4"), \
                 JOIN(N "5"), JOIN(N "6"), JOIN(N "7"), JOIN(N "8"), JOIN(N "9")
#define JOIN3(N) JOIN2(N "0"), JOIN2(N "1"), JOIN2(N "2"), JOIN2(N "3"), JOIN2(N "4"), \
                 JOIN2(N "5"), JOIN2(N "6"), JOIN2(N "7"), JOIN2(N "8"), JOIN2(N "9")
#define JOIN4    JOIN3("0"), JOIN3("1"), JOIN3("2"), JOIN3("3"), JOIN3("4"), \
                 JOIN3("5"), JOIN3("6"), JOIN3("7"), JOIN3("8"), JOIN3("9")
#define JOIN5(N) JOIN(N), JOIN(N "1"), JOIN(N "2"), JOIN(N "3"), JOIN(N "4"), \
                 JOIN(N "5"), JOIN(N "6"), JOIN(N "7"), JOIN(N "8"), JOIN(N "9")
#define JOIN6    JOIN5(), JOIN5("1"), JOIN5("2"), JOIN5("3"), JOIN5("4"), \
                 JOIN5("5"), JOIN5("6"), JOIN5("7"), JOIN5("8"), JOIN5("9")
#define F(N)     ((N) >= 100 ? 3 : (N) >= 10 ? 2 : 1)
#define F10(N)   F(N),F(N+1),F(N+2),F(N+3),F(N+4),F(N+5),F(N+6),F(N+7),F(N+8),F(N+9)
#define F100(N)  F10(N),F10(N+10),F10(N+20),F10(N+30),F10(N+40),\
                 F10(N+50),F10(N+60),F10(N+70),F10(N+80),F10(N+90)
  static const short offsets[] = { F100(0), F100(100), F100(200), F100(300), F100(400),
                                  F100(500), F100(600), F100(700), F100(800), F100(900)};
  static const char table1[][4] = { JOIN("") }; 
  static const char table2[][4] = { JOIN2("") }; 
  static const char table3[][4] = { JOIN3("") };
  static const char table4[][5] = { JOIN4 }; 
  static const char table5[][4] = { JOIN6 };
#undef JOIN
#undef JOIN2
#undef JOIN3
#undef JOIN4
  char *wstr;
  int remains[2];
  unsigned int v2;
  if (value >= 100000000) {
    v2 = value / 10000;
    remains[0] = value - v2 * 10000;
    value = v2;
    v2 = value / 10000;
    remains[1] = value - v2 * 10000;
    value = v2;
    wstr = str;
    if (value >= 1000) {
      *(__int32 *) wstr = *(__int32 *) table4[value];
      wstr += 4;
    } else {
      *(__int32 *) wstr = *(__int32 *) table5[value];
      wstr += offsets[value];
    }
    *(__int32 *) wstr = *(__int32 *) table4[remains[1]];
    wstr += 4;
    *(__int32 *) wstr = *(__int32 *) table4[remains[0]];
    wstr += 4;
    *wstr = 0;
    return (wstr - str);
  }
  else if (value >= 10000) {
    v2 = value / 10000;
    remains[0] = value - v2 * 10000;
    value = v2;
    wstr = str;
    if (value >= 1000) {
      *(__int32 *) wstr = *(__int32 *) table4[value];
      wstr += 4;
      *(__int32 *) wstr = *(__int32 *) table4[remains[0]];
      wstr += 4;
      *wstr = 0;
      return 8;
    } else {
      *(__int32 *) wstr = *(__int32 *) table5[value];
      wstr += offsets[value];
      *(__int32 *) wstr = *(__int32 *) table4[remains[0]];
      wstr += 4;
      *wstr = 0;
      return (wstr - str);
    }
  }
  else {
    if (value >= 1000) {
      *(__int32 *) str = *(__int32 *) table4[value];
      str += 4;
      *str = 0;
      return 4;
    } else if (value >= 100) {
      *(__int32 *) str = *(__int32 *) table3[value];
      return 3;
    } else if (value >= 10) {
      *(__int16 *) str = *(__int16 *) table2[value];
      str += 2;
      *str = 0;
      return 2;
    } else {
      *(__int16 *) str = *(__int16 *) table1[value];
      return 1;
    }
  }
}

int ufast_itoa10(int value, char* str) {
  if (value < 0) { *(str++) = '-'; 
    return ufast_utoa10(-value, str) + 1; 
  }
  else return ufast_utoa10(value, str);
}


    void ufast_test() {

   print_mode("[modp_ufast]");

   std::string s;
   s.reserve(32);
   std::size_t total_length = 0;
   strtk::util::timer t;
   t.start();

   char buf[128];
   int len;
   for (int i = (-max_i2s / 2); i < (max_i2s / 2); ++i)
   {
      #ifdef enable_test_type01
      s.resize(ufast_itoa10(((i & 1) ? i : -i), const_cast<char*>(s.c_str())));
      total_length += s.size();
      #endif

      #ifdef enable_test_type02
      s.resize(ufast_itoa10(max_i2s + i, const_cast<char*>(s.c_str())));
      total_length += s.size();
      #endif

      #ifdef enable_test_type03
      s.resize(ufast_itoa10(randval[(max_i2s + i) & 1023], const_cast<char*>(s.c_str())));
      total_length += s.size();
      #endif
   }
   t.stop();
   printf("Numbers:%10lu\tTotal:%12lu\tTime:%8.4fsec\tRate:%14.4fnums/sec\n",
          static_cast<unsigned long>(3 * max_i2s),
          static_cast<unsigned long>(total_length),
          t.time(),
          (3.0 * max_i2s) / t.time());
}
Run Code Online (Sandbox Code Playgroud)

  • 你永远不会把它放入字符串中.另外我不知道为什么你的其他人的代码的结果如此之低,你的CPU也不慢. (4认同)