哪一个更快?正则表达式还是EndsWith?

Dys*_*ial 26 c# regex windows string performance

什么会更快?

public String Roll()
{
    Random rnd = new Random();
    int roll = rnd.Next(1, 100000);
    if (Regex.IsMatch(roll.ToString(), @"(.)\1{1,}$"))
    {
        return "doubles";
    }
    return "none";
}
Run Code Online (Sandbox Code Playgroud)

要么

public String Roll()
{
    Random rnd = new Random();
    int roll = rnd.Next(1, 100000);
    if (roll.ToString().EndsWith("11") || roll.ToString().EndsWith("22")  || roll.ToString().EndsWith("33")  || roll.ToString().EndsWith("44")  || roll.ToString().EndsWith("55")  || roll.ToString().EndsWith("66")  || roll.ToString().EndsWith("77")  || roll.ToString().EndsWith("88")  || roll.ToString().EndsWith("99")  || roll.ToString().EndsWith("00"))
    {
        return "doubles";
    }
    return "none";
}
Run Code Online (Sandbox Code Playgroud)

我目前正在使用一个非常长的if语句列表,其中包含正则表达式,以检查int是否以双打,三元组,四元组,四元组等结尾...所以我想知道在更改所有内容之前哪一个更快.

Ian*_*Ian 34

In your particular case, Regex is actually faster... but it is likely because you use EndsWith with many OR and redundant ToString(). If you simplify the logic, simple string operation will likely be faster.


Here is the performance summary for text processing - from the fastest to the slowest (10 millions loop [Prefer/Non-Prefer 32-bit] - rank is ordered based on the fastest of the two):

  1. Large Lookup Fast Random UInt (not counted for bounty): 219/273 ms - Mine, improved from Evk's
  2. Large Lookup Optimized Random: 228/273 ms - Ivan Stoev's Alternate Solution
  3. Large Lookup Fast Random: 238/294 ms - Evk's Alternative Solution
  4. Large Lookup Parameterless Random: 248/287 ms - Thomas Ayoub

    我想在此解决方案上做几点说明(基于下面的评论):

    1. 此解决方案向小数字(<100000)引入0.0039%偏差(参考:Eric Lippert的博客文章,由Lucas Trzesniewski联系)
    2. 难道产生相同数量的序列,而其他被测试(参考:迈克尔·刘评论) -因为它改变使用方式Random(从Random.Next(int)Random.Next()),这是用于测试本身.

    虽然测试不能被具有完全相同的序列号对于该方法作为用于进行休息(如提到的Phil1970),我有两个点使:

    1. Some might be interested to look at the implement of Random.Next() vs Random.Next(int) to understand why this solution will still be faster even if the same sequence of numbers are used.
    2. The use of Random in the real case itself will (most of the time) not assume the number sequence to be the same (or predictable) - It is only for testing our method we want the Random sequence to be identical (for fair unit testing purpose). The expected faster result for this method cannot be fully derived from the testing result alone, but by also looking at the Next() vs Next(int) implementation.
  5. Large Look-up: 320/284 ms - Evk

  6. 最快的优化随机修改:286/333 ms Ivan Stoev
  7. Lookup Optimized Modded:315/329 ms - Corak
  8. 优化改装:471/330毫秒 - Stian Standahl
  9. 优化的Modded +常数:472/337 - GjermundGrøneng
  10. 最快的优化模块:345/340毫秒 - GjermundGrøneng
  11. 修改:496/370 ms - Corak +可能是Alexei Levenkov
  12. 数字:537/408毫秒 - Alois Kraus
  13. 简单:1668/1176 ms - 我的
  14. HashSet的包含:一千六百零九分之二千一百三十八毫秒- Dandré
  15. 列表包含:3013/2465 ms - 另一个矿井
  16. 编译正则表达式:8956/7675 ms - Radin Gospodinov
  17. 正则表达式:15032/16640 ms - OP的解决方案1
  18. EndsWith:24763/20702 ms - OP的解决方案2

以下是我的简单测试用例:

Random rnd = new Random(10000);
FastRandom fastRnd = new FastRandom(10000);

//OP's Solution 2
public String RollRegex() {
    int roll = rnd.Next(1, 100000);
    if (Regex.IsMatch(roll.ToString(), @"(.)\1{1,}$")) {
        return "doubles";
    } else {
        return "none";
    }
}

//Radin Gospodinov's Solution
Regex optionRegex = new Regex(@"(.)\1{1,}$", RegexOptions.Compiled);
public String RollOptionRegex() {
    int roll = rnd.Next(1, 100000);
    string rollString = roll.ToString();
    if (optionRegex.IsMatch(roll.ToString())) {
        return "doubles";
    } else {
        return "none";
    }
}

//OP's Solution 1
public String RollEndsWith() {
    int roll = rnd.Next(1, 100000);
    if (roll.ToString().EndsWith("11") || roll.ToString().EndsWith("22") || roll.ToString().EndsWith("33") || roll.ToString().EndsWith("44") || roll.ToString().EndsWith("55") || roll.ToString().EndsWith("66") || roll.ToString().EndsWith("77") || roll.ToString().EndsWith("88") || roll.ToString().EndsWith("99") || roll.ToString().EndsWith("00")) {
        return "doubles";
    } else {
        return "none";
    }
}

//My Solution   
public String RollSimple() {
    int roll = rnd.Next(1, 100000);
    string rollString = roll.ToString();
    return roll > 10 && rollString[rollString.Length - 1] == rollString[rollString.Length - 2] ?
        "doubles" : "none";
}

//My Other Solution
List<string> doubles = new List<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };
public String RollAnotherSimple() {
    int roll = rnd.Next(1, 100000);
    string rollString = roll.ToString();
    return rollString.Length > 1 && doubles.Contains(rollString.Substring(rollString.Length - 2)) ?
        "doubles" : "none";
}

//Dandré's Solution
HashSet<string> doublesHashset = new HashSet<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };
public String RollSimpleHashSet() {
    int roll = rnd.Next(1, 100000);
    string rollString = roll.ToString();
    return rollString.Length > 1 && doublesHashset.Contains(rollString.Substring(rollString.Length - 2)) ?
        "doubles" : "none";
}

//Corak's Solution - hinted by Alexei Levenkov too
public string RollModded() { int roll = rnd.Next(1, 100000); return roll % 100 % 11 == 0 ? "doubles" : "none"; }

//Stian Standahl optimizes modded solution
public string RollOptimizedModded() { return rnd.Next(1, 100000) % 100 % 11 == 0 ? "doubles" : "none"; }

//Gjermund Grøneng's method with constant addition
private const string CONST_DOUBLES = "doubles";
private const string CONST_NONE = "none";
public string RollOptimizedModdedConst() { return rnd.Next(1, 100000) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; }

//Gjermund Grøneng's method after optimizing the Random (The fastest!)
public string FastestOptimizedModded() { return (rnd.Next(99999) + 1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; }

//Corak's Solution, added on Gjermund Grøneng's
private readonly string[] Lookup = { "doubles", "none", "none", "none", "none", "none", "none", "none", "none", "none", "none" };
public string RollLookupOptimizedModded() { return Lookup[(rnd.Next(99999) + 1) % 100 % 11]; }

//Evk's Solution, large Lookup
private string[] LargeLookup;
private void InitLargeLookup() {
    LargeLookup = new string[100000];
    for (int i = 0; i < 100000; i++) {
        LargeLookup[i] = i % 100 % 11 == 0 ? "doubles" : "none";
    }
}
public string RollLargeLookup() { return LargeLookup[rnd.Next(99999) + 1]; }

//Thomas Ayoub's Solution
public string RollLargeLookupParameterlessRandom() { return LargeLookup[rnd.Next() % 100000]; }

//Alois Kraus's Solution
public string RollNumbers() {
    int roll = rnd.Next(1, 100000);
    int lastDigit = roll % 10;
    int secondLastDigit = (roll / 10) % 10;

    if (lastDigit == secondLastDigit) {
        return "doubles";
    } else {
        return "none";
    }
}

//Ivan Stoev's Solution
public string FastestOptimizedRandomModded() {
    return ((int)(rnd.Next() * (99999.0 / int.MaxValue)) + 1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE;
}

//Ivan Stoev's Alternate Solution
public string RollLargeLookupOptimizedRandom() {
    return LargeLookup[(int)(rnd.Next() * (99999.0 / int.MaxValue))];
}

//Evk's Solution using FastRandom
public string RollLargeLookupFastRandom() {
    return LargeLookup[fastRnd.Next(99999) + 1];
}

//My Own Test, using FastRandom + NextUInt
public string RollLargeLookupFastRandomUInt() {
    return LargeLookup[fastRnd.NextUInt() % 99999 + 1];
}
Run Code Online (Sandbox Code Playgroud)

附加FastRandom课程:

//FastRandom's part used for the testing
public class FastRandom {
    // The +1 ensures NextDouble doesn't generate 1.0
    const double REAL_UNIT_INT = 1.0 / ((double)int.MaxValue + 1.0);
    const double REAL_UNIT_UINT = 1.0 / ((double)uint.MaxValue + 1.0);
    const uint Y = 842502087, Z = 3579807591, W = 273326509;

    uint x, y, z, w;

    #region Constructors

    /// <summary>
    /// Initialises a new instance using time dependent seed.
    /// </summary>
    public FastRandom() {
        // Initialise using the system tick count.
        Reinitialise((int)Environment.TickCount);
    }

    /// <summary>
    /// Initialises a new instance using an int value as seed.
    /// This constructor signature is provided to maintain compatibility with
    /// System.Random
    /// </summary>
    public FastRandom(int seed) {
        Reinitialise(seed);
    }

    #endregion

    #region Public Methods [Reinitialisation]

    /// <summary>
    /// Reinitialises using an int value as a seed.
    /// </summary>
    /// <param name="seed"></param>
    public void Reinitialise(int seed) {
        // The only stipulation stated for the xorshift RNG is that at least one of
        // the seeds x,y,z,w is non-zero. We fulfill that requirement by only allowing
        // resetting of the x seed
        x = (uint)seed;
        y = Y;
        z = Z;
        w = W;
    }

    #endregion

    #region Public Methods [System.Random functionally equivalent methods]

    /// <summary>
    /// Generates a random int over the range 0 to int.MaxValue-1.
    /// MaxValue is not generated in order to remain functionally equivalent to System.Random.Next().
    /// This does slightly eat into some of the performance gain over System.Random, but not much.
    /// For better performance see:
    /// 
    /// Call NextInt() for an int over the range 0 to int.MaxValue.
    /// 
    /// Call NextUInt() and cast the result to an int to generate an int over the full Int32 value range
    /// including negative values. 
    /// </summary>
    /// <returns></returns>
    public int Next() {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;
        w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

        // Handle the special case where the value int.MaxValue is generated. This is outside of 
        // the range of permitted values, so we therefore call Next() to try again.
        uint rtn = w & 0x7FFFFFFF;
        if (rtn == 0x7FFFFFFF)
            return Next();
        return (int)rtn;
    }

    /// <summary>
    /// Generates a random int over the range 0 to upperBound-1, and not including upperBound.
    /// </summary>
    /// <param name="upperBound"></param>
    /// <returns></returns>
    public int Next(int upperBound) {
        if (upperBound < 0)
            throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=0");

        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;

        // The explicit int cast before the first multiplication gives better performance.
        // See comments in NextDouble.
        return (int)((REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * upperBound);
    }

    /// <summary>
    /// Generates a random int over the range lowerBound to upperBound-1, and not including upperBound.
    /// upperBound must be >= lowerBound. lowerBound may be negative.
    /// </summary>
    /// <param name="lowerBound"></param>
    /// <param name="upperBound"></param>
    /// <returns></returns>
    public int Next(int lowerBound, int upperBound) {
        if (lowerBound > upperBound)
            throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=lowerBound");

        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;

        // The explicit int cast before the first multiplication gives better performance.
        // See comments in NextDouble.
        int range = upperBound - lowerBound;
        if (range < 0) {   // If range is <0 then an overflow has occured and must resort to using long integer arithmetic instead (slower).
            // We also must use all 32 bits of precision, instead of the normal 31, which again is slower.  
            return lowerBound + (int)((REAL_UNIT_UINT * (double)(w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))) * (double)((long)upperBound - (long)lowerBound));
        }

        // 31 bits of precision will suffice if range<=int.MaxValue. This allows us to cast to an int and gain
        // a little more performance.
        return lowerBound + (int)((REAL_UNIT_INT * (double)(int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * (double)range);
    }

    /// <summary>
    /// Generates a random double. Values returned are from 0.0 up to but not including 1.0.
    /// </summary>
    /// <returns></returns>
    public double NextDouble() {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;

        // Here we can gain a 2x speed improvement by generating a value that can be cast to 
        // an int instead of the more easily available uint. If we then explicitly cast to an 
        // int the compiler will then cast the int to a double to perform the multiplication, 
        // this final cast is a lot faster than casting from a uint to a double. The extra cast
        // to an int is very fast (the allocated bits remain the same) and so the overall effect 
        // of the extra cast is a significant performance improvement.
        //
        // Also note that the loss of one bit of precision is equivalent to what occurs within 
        // System.Random.
        return (REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))));
    }


    /// <summary>
    /// Fills the provided byte array with random bytes.
    /// This method is functionally equivalent to System.Random.NextBytes(). 
    /// </summary>
    /// <param name="buffer"></param>
    public void NextBytes(byte[] buffer) {
        // Fill up the bulk of the buffer in chunks of 4 bytes at a time.
        uint x = this.x, y = this.y, z = this.z, w = this.w;
        int i = 0;
        uint t;
        for (int bound = buffer.Length - 3; i < bound; ) {
            // Generate 4 bytes. 
            // Increased performance is achieved by generating 4 random bytes per loop.
            // Also note that no mask needs to be applied to zero out the higher order bytes before
            // casting because the cast ignores thos bytes. Thanks to Stefan Trosch?tz for pointing this out.
            t = (x ^ (x << 11));
            x = y; y = z; z = w;
            w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

            buffer[i++] = (byte)w;
            buffer[i++] = (byte)(w >> 8);
            buffer[i++] = (byte)(w >> 16);
            buffer[i++] = (byte)(w >> 24);
        }

        // Fill up any remaining bytes in the buffer.
        if (i < buffer.Length) {
            // Generate 4 bytes.
            t = (x ^ (x << 11));
            x = y; y = z; z = w;
            w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

            buffer[i++] = (byte)w;
            if (i < buffer.Length) {
                buffer[i++] = (byte)(w >> 8);
                if (i < buffer.Length) {
                    buffer[i++] = (byte)(w >> 16);
                    if (i < buffer.Length) {
                        buffer[i] = (byte)(w >> 24);
                    }
                }
            }
        }
        this.x = x; this.y = y; this.z = z; this.w = w;
    }


    //      /// <summary>
    //      /// A version of NextBytes that uses a pointer to set 4 bytes of the byte buffer in one operation
    //      /// thus providing a nice speedup. The loop is also partially unrolled to allow out-of-order-execution,
    //      /// this results in about a x2 speedup on an AMD Athlon. Thus performance may vary wildly on different CPUs
    //      /// depending on the number of execution units available.
    //      /// 
    //      /// Another significant speedup is obtained by setting the 4 bytes by indexing pDWord (e.g. pDWord[i++]=w)
    //      /// instead of adjusting it dereferencing it (e.g. *pDWord++=w).
    //      /// 
    //      /// Note that this routine requires the unsafe compilation flag to be specified and so is commented out by default.
    //      /// </summary>
    //      /// <param name="buffer"></param>
    //      public unsafe void NextBytesUnsafe(byte[] buffer)
    //      {
    //          if(buffer.Length % 8 != 0)
    //              throw new ArgumentException("Buffer length must be divisible by 8", "buffer");
    //
    //          uint x=this.x, y=this.y, z=this.z, w=this.w;
    //          
    //          fixed(byte* pByte0 = buffer)
    //          {
    //              uint* pDWord = (uint*)pByte0;
    //              for(int i=0, len=buffer.Length>>2; i < len; i+=2) 
    //              {
    //                  uint t=(x^(x<<11));
    //                  x=y; y=z; z=w;
    //                  pDWord[i] = w = (w^(w>>19))^(t^(t>>8));
    //
    //                  t=(x^(x<<11));
    //                  x=y; y=z; z=w;
    //                  pDWord[i+1] = w = (w^(w>>19))^(t^(t>>8));
    //              }
    //          }
    //
    //          this.x=x; this.y=y; this.z=z; this.w=w;
    //      }

    #endregion

    #region Public Methods [Methods not present on System.Random]

    /// <summary>
    /// Generates a uint. Values returned are over the full range of a uint, 
    /// uint.MinValue to uint.MaxValue, inclusive.
    /// 
    /// This is the fastest method for generating a single random number because the underlying
    /// random number generator algorithm generates 32 random bits that can be cast directly to 
    /// a uint.
    /// </summary>
    /// <returns></returns>
    public uint NextUInt() {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;
        return (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)));
    }

    /// <summary>
    /// Generates a random int over the range 0 to int.MaxValue, inclusive. 
    /// This method differs from Next() only in that the range is 0 to int.MaxValue
    /// and not 0 to int.MaxValue-1.
    /// 
    /// The slight difference in range means this method is slightly faster than Next()
    /// but is not functionally equivalent to System.Random.Next().
    /// </summary>
    /// <returns></returns>
    public int NextInt() {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;
        return (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))));
    }


    // Buffer 32 bits in bitBuffer, return 1 at a time, keep track of how many have been returned
    // with bitBufferIdx.
    uint bitBuffer;
    uint bitMask = 1;

    /// <summary>
    /// Generates a single random bit.
    /// This method's performance is improved by generating 32 bits in one operation and storing them
    /// ready for future calls.
    /// </summary>
    /// <returns></returns>
    public bool NextBool() {
        if (bitMask == 1) {
            // Generate 32 more bits.
            uint t = (x ^ (x << 11));
            x = y; y = z; z = w;
            bitBuffer = w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

            // Reset the bitMask that tells us which bit to read next.
            bitMask = 0x80000000;
            return (bitBuffer & bitMask) == 0;
        }

        return (bitBuffer & (bitMask >>= 1)) == 0;
    }

    #endregion
}
Run Code Online (Sandbox Code Playgroud)

测试场景:

public delegate string RollDelegate();
private void Test() {
    List<string> rollMethodNames = new List<string>(){
        "Large Lookup Fast Random UInt",
        "Large Lookup Fast Random",
        "Large Lookup Optimized Random",
        "Fastest Optimized Random Modded",
        "Numbers",
        "Large Lookup Parameterless Random",
        "Large Lookup",
        "Lookup Optimized Modded",
        "Fastest Optimized Modded",
        "Optimized Modded Const",
        "Optimized Modded",
        "Modded",
        "Simple",
        "Another simple with HashSet",
        "Another Simple",
        "Option (Compiled) Regex",
        "Regex",
        "EndsWith",
    };
    List<RollDelegate> rollMethods = new List<RollDelegate>{
        RollLargeLookupFastRandomUInt,
        RollLargeLookupFastRandom,
        RollLargeLookupOptimizedRandom,
        FastestOptimizedRandomModded,
        RollNumbers,
        RollLargeLookupParameterlessRandom,
        RollLargeLookup,
        RollLookupOptimizedModded,
        FastestOptimizedModded,
        RollOptimizedModdedConst,
        RollOptimizedModded,
        RollModded,
        RollSimple,
        RollSimpleHashSet,
        RollAnotherSimple,
        RollOptionRegex,
        RollRegex,
        RollEndsWith
    };
    int trial = 10000000;
    InitLargeLookup();
    for (int k = 0; k < rollMethods.Count; ++k) {
        rnd = new Random(10000);
        fastRnd = new FastRandom(10000);
        logBox.GetTimeLapse();
        for (int i = 0; i < trial; ++i)
            rollMethods[k]();
        logBox.WriteTimedLogLine(rollMethodNames[k] + ": " + logBox.GetTimeLapse());
    }
}
Run Code Online (Sandbox Code Playgroud)

结果(首选32位):

[2016-05-30 08:20:54.056 UTC] Large Lookup Fast Random UInt: 219 ms
[2016-05-30 08:20:54.296 UTC] Large Lookup Fast Random: 238 ms
[2016-05-30 08:20:54.524 UTC] Large Lookup Optimized Random: 228 ms
[2016-05-30 08:20:54.810 UTC] Fastest Optimized Random Modded: 286 ms
[2016-05-30 08:20:55.347 UTC] Numbers: 537 ms
[2016-05-30 08:20:55.596 UTC] Large Lookup Parameterless Random: 248 ms
[2016-05-30 08:20:55.916 UTC] Large Lookup: 320 ms
[2016-05-30 08:20:56.231 UTC] Lookup Optimized Modded: 315 ms
[2016-05-30 08:20:56.577 UTC] Fastest Optimized Modded: 345 ms
[2016-05-30 08:20:57.049 UTC] Optimized Modded Const: 472 ms
[2016-05-30 08:20:57.521 UTC] Optimized Modded: 471 ms
[2016-05-30 08:20:58.017 UTC] Modded: 496 ms
[2016-05-30 08:20:59.685 UTC] Simple: 1668 ms
[2016-05-30 08:21:01.824 UTC] Another simple with HashSet: 2138 ms
[2016-05-30 08:21:04.837 UTC] Another Simple: 3013 ms
[2016-05-30 08:21:13.794 UTC] Option (Compiled) Regex: 8956 ms
[2016-05-30 08:21:28.827 UTC] Regex: 15032 ms
[2016-05-30 08:21:53.589 UTC] EndsWith: 24763 ms
Run Code Online (Sandbox Code Playgroud)

结果(非首选32位):

[2016-05-30 08:16:00.934 UTC] Large Lookup Fast Random UInt: 273 ms
[2016-05-30 08:16:01.230 UTC] Large Lookup Fast Random: 294 ms
[2016-05-30 08:16:01.503 UTC] Large Lookup Optimized Random: 273 ms
[2016-05-30 08:16:01.837 UTC] Fastest Optimized Random Modded: 333 ms
[2016-05-30 08:16:02.245 UTC] Numbers: 408 ms
[2016-05-30 08:16:02.532 UTC] Large Lookup Parameterless Random: 287 ms
[2016-05-30 08:16:02.816 UTC] Large Lookup: 284 ms
[2016-05-30 08:16:03.145 UTC] Lookup Optimized Modded: 329 ms
[2016-05-30 08:16:03.486 UTC] Fastest Optimized Modded: 340 ms
[2016-05-30 08:16:03.824 UTC] Optimized Modded Const: 337 ms
[2016-05-30 08:16:04.154 UTC] Optimized Modded: 330 ms
[2016-05-30 08:16:04.524 UTC] Modded: 370 ms
[2016-05-30 08:16:05.700 UTC] Simple: 1176 ms
[2016-05-30 08:16:07.309 UTC] Another simple with HashSet: 1609 ms
[2016-05-30 08:16:09.774 UTC] Another Simple: 2465 ms
[2016-05-30 08:16:17.450 UTC] Option (Compiled) Regex: 7675 ms
[2016-05-30 08:16:34.090 UTC] Regex: 16640 ms
[2016-05-30 08:16:54.793 UTC] EndsWith: 20702 ms
Run Code Online (Sandbox Code Playgroud)

和图片:

在此输入图像描述


小智 12

@StianStandahls朋友在这里.这个解决方案最快!它与@Ians答案中前一个最快的示例相同,但随机生成器在此处进行了优化.

private const string CONST_DOUBLES = "doubles";
private const string CONST_NONE = "none";
public string FastestOptimizedModded()
{
    return (rnd.Next(99999)+1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE;
}
Run Code Online (Sandbox Code Playgroud)

  • myyyyyyy ............ goodnesssssssssss ........... !!! +1我无法相信甚至还有*更快*的解决方案!今天我不能再测试答案了,但明天我肯定会把这个答案包含在测试中!当然,链接你的名字.就像其他人一样:)很高兴见到你...... :) (2认同)

atl*_*ste 5

至于大多数性能,我相信@Ian 已经很好地涵盖了这一点。所有功劳都归他所有。

在 Q/A 中没有得到令我满意的答案是为什么 Regex 在这种情况下优于 EndsWith。我觉得有必要解释其中的差异,以便人们意识到哪种解决方案在哪种情况下可能会更好。

以。。结束

EndsWith 功能基本上是按顺序对字符串的一部分进行“比较”。像这样的东西:

bool EndsWith(string haystack, string needle)
{
    bool equal = haystack.Length >= needle.Length;
    for (int i=0; i<needle.Length && equal; ++i)
    {
        equal = s[i] == needle[needle.Length - haystack.Length + i];
    }
    return equal;
}
Run Code Online (Sandbox Code Playgroud)

代码非常简单;我们只需获取第一个字符,看看它是否匹配,然后是下一个字符,依此类推 - 直到我们到达字符串的末尾。

正则表达式

正则表达式的工作方式不同。考虑在一个非常大的干草堆中寻找针“foofoo”。明显的实现是在第一个字符处检查它是否是“f”,然后移动到下一个字符,依此类推,直到到达字符串的末尾。然而,我们可以做得更好:

仔细看看任务。如果我们首先查看字符串的第 5 个字符,并注意到它不是“o”(最后一个字符),我们可以立即跳到第 11 个字符并再次检查它是否是“o”。这样,我们将在最好的情况下比原始代码获得 6 倍的良好改进,并在最坏的情况下获得相同的性能。

另请注意,正则表达式可能会因“或”、“和”等而变得更加复杂。如果我们只需要查看尾随字符,则进行前向扫描不再有意义。

这就是为什么正则表达式通常与编译为 DFA 的 NFA 一起使用。这里有一个很棒的在线工具:http://hackingoff.com/compilers/regular-expression-to-nfa-dfa,它显示了它的样子(对于简单的正则表达式)。

在内部,您可以要求 .NET 使用以下命令编译Reflection.Emit正则表达式,当您使用正则表达式时,您实际上会评估此优化的已编译状态机 ( RegexOptions.Compiled)。

你最终可能会得到这样的结果:

bool Matches(string haystack)
{
    char _1;
    int index = 0;

    // match (.)
state0:
    if (index >= haystack.Length) 
    {
        goto stateFail;
    }
    _1 = haystack[index]; 
    state = 1;
    ++index;
    goto state1;

    // match \1{1}
state1:
    if (index >= haystack.Length) 
    {
        goto stateFail;
    }
    if (_1 == haystack[index])
    {
        ++index;
        goto state2;
    }
    goto stateFail;

    // match \1{2,*}$ -- usually optimized away because it always succeeds
state1:
    if (index >= haystack.Length) 
    {
        goto stateSuccess;
    }
    if (_1 == haystack[index])
    {
        ++index;
        goto state2;
    }
    goto stateSuccess;

stateSuccess:
    return true;

stateFail:
    return false;
}
Run Code Online (Sandbox Code Playgroud)

那么什么更快呢?

嗯,这要看情况。从表达式确定 NFA/DFA、编译程序以及每次调用查找程序并对其求值都会产生开销。对于非常简单的情况,anEndsWith胜过正则表达式。在本例中,是 'OR' 使其EndsWith比正则表达式慢。

另一方面,正则表达式通常是您多次使用的东西,这意味着您只需编译它一次,并且只需在每次调用时查找它即可。