随机数似乎不是随机的

jkr*_*r01 45 c# random base32

我试图生成6个字符或更少的随机base32数字.这应该给出大约10亿种不同的组合.

我创建了一个程序来生成这些"随机"数字.但是,它似乎每40,000代平均产生一次重复.

当有超过十亿种不同的组合时,为什么这些"随机"数字经常重复?

这是我的代码:

static void Main(string[] args)
{
    int seed = Environment.TickCount;
    Random r = new Random(seed);

    Dictionary<int, int> resultDictionary = new Dictionary<int, int>();
    for (int x = 1; x <= 1000; x++)
    {
        Dictionary<string, int> dictionary = new Dictionary<string, int>();
        try
        {
            while (true)
            {
                int rand = r.Next(0, 1073741823);
                CrockfordBase32Encoding encoding = new CrockfordBase32Encoding();
                string encodedRand = encoding.Encode((ulong)rand, false);
                dictionary.Add(encodedRand, rand);
            }
        }
        catch (Exception)
        {
        }
        Console.WriteLine(string.Format("{0} - {1}", x, dictionary.Count));
        resultDictionary.Add(x, dictionary.Count);
        x++;
    }

    Console.WriteLine();
    Console.WriteLine("Average Number Before Duplicate: " + resultDictionary.Average(x => x.Value));
    Console.ReadLine();
}
Run Code Online (Sandbox Code Playgroud)

D S*_*ley 118

这类似于生日问题.鉴于一群n人,两个人共享同一个生日1的概率是多少?它比你想象的要高.

在你的情况下,随机选择一个介于0和1,073,741,823之间的数字的几率是多少?

上面链接的一个近似值是1-exp(-(n*n)/(2*d)).如果n=40,000这相当于选择副本的概率大约为52.5%,那么在平均40,000个选择之后看到重复似乎是合理的.


1假设生日普遍均匀分布,实际情况并非如此,但"足够接近"并使数学更容易

  • @Falco我的旧MP3播放器将通过构建所有歌曲的列表来"洗牌+重复",对列表进行洗牌,然后循环播放.通过这种方式,一旦我听到一首歌,在所有其他歌曲播放之前,我绝对不会再听到它;)仍然是随机的! (11认同)
  • 第一句话应该是:"随机选择的人都有不同的生日有什么机会?" (10认同)
  • 实际上是一个有趣的心理问题,因为人们期望随机性几乎不会产生相同的结果.生日悖论的一个变种:如果你让人们猜测很多骰子,他们几乎不会连续两次猜出相同的数字,尽管它经常发生. - >如果您为音乐播放器编写一个随机播放+重复播放模式,您必须伪造您的结果并使同一首歌不太可能再次播放,或者对于大多数人来说它不会感觉随机;-) (9认同)
  • @Damon:这取决于你的意思是"非常非均分".Nunnikhoven(http://www.jstor.org/stable/2685309)使用了美国生日分布数据,并显示真实概率在小数点后3位与假设生日均匀分布计算的小概率不同.许多人会发现大约0.1%的误差是可以接受的. (3认同)
  • 我记得有人会从他们的Battle.net身份验证器发布123456或888888的截图,以"证明"结果不是随机的.当然,如果不可能产生这样的数字,那么发电机就不会是随机的.人类只是不了解数字 (2认同)

ang*_*son 41

这被称为生日问题,只是基本的概率论.

1到K范围内的N个随机数给出重复的概率是:

在此输入图像描述
在此输入图像描述

要计算获得至少一个重复的可能性,从1中减去该值.

在你的情况下,它评估为

P(40000, 1073741823) = 1 - p(40000, 1073741823)
Run Code Online (Sandbox Code Playgroud)

通过使用Wolfram Alpha进行计算,结果如下

0.5252888122305790
Run Code Online (Sandbox Code Playgroud)

这意味着你获得重复的可能性略高于50%.随着您生成更多数字,您将越来越多地获得重复数据.

以下是对N的更多评价:

   N      Result
 40000    0.5253
100000    0.9905
200000    0.9999
Run Code Online (Sandbox Code Playgroud)

  • @ user11153如果您编辑文本来描述新公式,那将是很好的,因为图像是*not*具有重复的公式.请记住,当您编辑问题/答案以确保它们之后的意思相同时. (2认同)

Dar*_*rek 5

框架中包含的随机数生成器是伪随机的,没有任何随机数分布的保证。如果您担心分发模式,请考虑以下文章:http : //www.codeproject.com/Articles/15102/NET-random-number-generators-and-distributions

但是,我的统计教授(没有一个)曾经说过:“有一个小谎言,有一个大谎言,而且有统计学”。

首先是完整的代码,因此人们不必在互联网上寻找类实现来进行测试:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            var r = RandomProvider.GetThreadRandom();

            Dictionary<int, int> resultDictionary = new Dictionary<int, int>();
            for (int x = 1; x <= 1000; x++)
            {
                Dictionary<string, int> dictionary = new Dictionary<string, int>();
                try
                {
                    while (true)
                    {
                        int rand = r.Next(0, 1073741823);
                        CrockfordBase32Encoding encoding = new CrockfordBase32Encoding();
                        string encodedRand = encoding.Encode((ulong)rand, false);
                        dictionary.Add(encodedRand, rand);
                    }
                }
                catch (Exception)
                {
                }
                Console.WriteLine("{0} - {1}", x, dictionary.Count);
                resultDictionary.Add(x, dictionary.Count);
                x++;
            }

            Console.WriteLine();
            Console.WriteLine("Average Number Before Duplicate: " + resultDictionary.Average(x => x.Value));
            Console.WriteLine("Minimum Number Before Duplicate: " + resultDictionary.Min(x => x.Value));
            Console.WriteLine("Maximum Number Before Duplicate: " + resultDictionary.Max(x => x.Value));
            Console.WriteLine(" Median Number Before Duplicate: " + resultDictionary.Select(x=>x.Value).Median());
            Console.ReadLine();
        }


    }

    public static class Extensions
    {
        public static double Median<T>(this IEnumerable<T> list)
        {
            List<double> orderedList = list.Select(s=>Convert.ToDouble(s))
                .OrderBy(numbers => numbers)
                .ToList();

            int listSize = orderedList.Count;
            double result;

            if (listSize % 2 == 0) // even
            {
                int midIndex = listSize / 2;
                result = ((orderedList.ElementAt(midIndex - 1) +
                           orderedList.ElementAt(midIndex)) / 2);
            }
            else // odd
            {
                double element = (double)listSize / 2;
                element = Math.Round(element, MidpointRounding.AwayFromZero);

                result = orderedList.ElementAt((int)(element - 1));
            }

            return result;
        } 
    }


    public static class RandomProvider
    {
        private static int seed = Environment.TickCount;

        private static ThreadLocal<Random> randomWrapper = new ThreadLocal<Random>(() =>
            new Random(Interlocked.Increment(ref seed))
            );

        public static Random GetThreadRandom()
        {
            return randomWrapper.Value;
        }
    }

    public class CrockfordBase32Encoding
    {
        const int Base = 32;
        const int CheckDigitBase = 37;

        static readonly IDictionary<int, char> valueEncodings;
        static readonly IDictionary<int, char> checkDigitEncodings;
        static readonly IDictionary<char, int> valueDecodings;
        static readonly IDictionary<char, int> checkDigitDecodings;
        static CrockfordBase32Encoding()
        {
            var symbols = new SymbolDefinitions();
            valueEncodings = symbols.ValueEncodings;
            checkDigitEncodings = symbols.CheckDigitEncodings;
            valueDecodings = symbols.ValueDecodings;
            checkDigitDecodings = symbols.CheckDigitDecodings;
        }

        public string Encode(ulong input, bool includeCheckDigit)
        {
            var chunks = SplitInto5BitChunks(input);
            var characters = chunks.Select(chunk => valueEncodings[chunk]);

            if (includeCheckDigit)
            {
                var checkValue = (int)(input % CheckDigitBase);
                characters = characters.Concat(new[] { checkDigitEncodings[checkValue] });
            }

            return new string(characters.ToArray());
        }

        internal static IEnumerable<byte> SplitInto5BitChunks(ulong input)
        {
            const int bitsPerChunk = 5;
            const int shift = (sizeof(ulong) * 8) - bitsPerChunk;
            var chunks = new List<byte>();
            do
            {
                var lastChunk = input << shift >> shift;
                chunks.Insert(0, (byte)lastChunk);
                input = input >> bitsPerChunk;
            } while (input > 0);
            return chunks;
        }

        public ulong? Decode(string encodedString, bool treatLastCharacterAsCheckDigit)
        {
            if (encodedString == null)
                throw new ArgumentNullException("encodedString");

            if (encodedString.Length == 0)
                return null;

            IEnumerable<char> charactersInReverse = encodedString.Reverse().ToArray();

            int? expectedCheckValue = null;
            if (treatLastCharacterAsCheckDigit)
            {
                var checkDigit = charactersInReverse.First();
                if (!checkDigitDecodings.ContainsKey(checkDigit)) return null;
                expectedCheckValue = checkDigitDecodings[checkDigit];

                charactersInReverse = charactersInReverse.Skip(1);
            }

            ulong number = 0;
            ulong currentBase = 1;
            foreach (var character in charactersInReverse)
            {
                if (!valueDecodings.ContainsKey(character)) return null;

                var value = valueDecodings[character];
                number += (ulong)value * currentBase;

                currentBase *= Base;
            }

            if (expectedCheckValue.HasValue &&
                (int)(number % CheckDigitBase) != expectedCheckValue)
                return null;

            return number;
        }
    }

    internal class SymbolDefinitions : List<SymbolDefinition>
    {
        readonly List<SymbolDefinition> extraCheckDigits = new List<SymbolDefinition>();

        public SymbolDefinitions()
        {
            AddRange(new[]
            {
                new SymbolDefinition { Value = 0, EncodeSymbol = '0', DecodeSymbols = new[] { '0', 'O', 'o' } },
                new SymbolDefinition { Value = 1, EncodeSymbol = '1', DecodeSymbols = new[] { '1', 'I', 'i', 'L', 'l' } },
                new SymbolDefinition { Value = 2, EncodeSymbol = '2', DecodeSymbols = new[] { '2' } },
                new SymbolDefinition { Value = 3, EncodeSymbol = '3', DecodeSymbols = new[] { '3' } },
                new SymbolDefinition { Value = 4, EncodeSymbol = '4', DecodeSymbols = new[] { '4' } },
                new SymbolDefinition { Value = 5, EncodeSymbol = '5', DecodeSymbols = new[] { '5' } },
                new SymbolDefinition { Value = 6, EncodeSymbol = '6', DecodeSymbols = new[] { '6' } },
                new SymbolDefinition { Value = 7, EncodeSymbol = '7', DecodeSymbols = new[] { '7' } },
                new SymbolDefinition { Value = 8, EncodeSymbol = '8', DecodeSymbols = new[] { '8' } },
                new SymbolDefinition { Value = 9, EncodeSymbol = '9', DecodeSymbols = new[] { '9' } },
                new SymbolDefinition { Value = 10, EncodeSymbol = 'A', DecodeSymbols = new[] { 'A', 'a' } },
                new SymbolDefinition { Value = 11, EncodeSymbol = 'B', DecodeSymbols = new[] { 'B', 'b' } },
                new SymbolDefinition { Value = 12, EncodeSymbol = 'C', DecodeSymbols = new[] { 'C', 'c' } },
                new SymbolDefinition { Value = 13, EncodeSymbol = 'D', DecodeSymbols = new[] { 'D', 'd' } },
                new SymbolDefinition { Value = 14, EncodeSymbol = 'E', DecodeSymbols = new[] { 'E', 'e' } },
                new SymbolDefinition { Value = 15, EncodeSymbol = 'F', DecodeSymbols = new[] { 'F', 'f' } },
                new SymbolDefinition { Value = 16, EncodeSymbol = 'G', DecodeSymbols = new[] { 'G', 'g' } },
                new SymbolDefinition { Value = 17, EncodeSymbol = 'H', DecodeSymbols = new[] { 'H', 'h' } },
                new SymbolDefinition { Value = 18, EncodeSymbol = 'J', DecodeSymbols = new[] { 'J', 'j' } },
                new SymbolDefinition { Value = 19, EncodeSymbol = 'K', DecodeSymbols = new[] { 'K', 'k' } },
                new SymbolDefinition { Value = 20, EncodeSymbol = 'M', DecodeSymbols = new[] { 'M', 'm' } },
                new SymbolDefinition { Value = 21, EncodeSymbol = 'N', DecodeSymbols = new[] { 'N', 'n' } },
                new SymbolDefinition { Value = 22, EncodeSymbol = 'P', DecodeSymbols = new[] { 'P', 'p' } },
                new SymbolDefinition { Value = 23, EncodeSymbol = 'Q', DecodeSymbols = new[] { 'Q', 'q' } },
                new SymbolDefinition { Value = 24, EncodeSymbol = 'R', DecodeSymbols = new[] { 'R', 'r' } },
                new SymbolDefinition { Value = 25, EncodeSymbol = 'S', DecodeSymbols = new[] { 'S', 's' } },
                new SymbolDefinition { Value = 26, EncodeSymbol = 'T', DecodeSymbols = new[] { 'T', 't' } },
                new SymbolDefinition { Value = 27, EncodeSymbol = 'V', DecodeSymbols = new[] { 'V', 'v' } },
                new SymbolDefinition { Value = 28, EncodeSymbol = 'W', DecodeSymbols = new[] { 'W', 'w' } },
                new SymbolDefinition { Value = 29, EncodeSymbol = 'X', DecodeSymbols = new[] { 'X', 'x' } },
                new SymbolDefinition { Value = 30, EncodeSymbol = 'Y', DecodeSymbols = new[] { 'Y', 'y' } },
                new SymbolDefinition { Value = 31, EncodeSymbol = 'Z', DecodeSymbols = new[] { 'Z', 'z' } },
            });

            extraCheckDigits.AddRange(new[]
            {
                new SymbolDefinition { Value = 32, EncodeSymbol = '*', DecodeSymbols = new[] { '*' } },
                new SymbolDefinition { Value = 33, EncodeSymbol = '~', DecodeSymbols = new[] { '~' } },
                new SymbolDefinition { Value = 34, EncodeSymbol = '$', DecodeSymbols = new[] { '$' } },
                new SymbolDefinition { Value = 35, EncodeSymbol = '=', DecodeSymbols = new[] { '=' } },
                new SymbolDefinition { Value = 36, EncodeSymbol = 'U', DecodeSymbols = new[] { 'U', 'u' } },
            });
        }

        public IDictionary<int, char> ValueEncodings
        {
            get
            {
                return this.ToDictionary(s => s.Value, s => s.EncodeSymbol);
            }
        }

        public IDictionary<int, char> CheckDigitEncodings
        {
            get
            {
                return this
                    .Union(extraCheckDigits)
                    .ToDictionary(s => s.Value, s => s.EncodeSymbol);
            }
        }

        public IDictionary<char, int> ValueDecodings
        {
            get
            {
                return this
                    .SelectMany(s => s.DecodeSymbols.Select(d => new { s.Value, DecodeSymbol = d }))
                    .ToDictionary(s => s.DecodeSymbol, s => s.Value);
            }
        }

        public IDictionary<char, int> CheckDigitDecodings
        {
            get
            {
                return this
                    .Union(extraCheckDigits)
                    .SelectMany(s => s.DecodeSymbols.Select(d => new { s.Value, DecodeSymbol = d }))
                    .ToDictionary(s => s.DecodeSymbol, s => s.Value);
            }
        }
    }

    internal class SymbolDefinition
    {
        public int Value { get; set; }
        public IEnumerable<char> DecodeSymbols { get; set; }
        public char EncodeSymbol { get; set; }
    }
}
Run Code Online (Sandbox Code Playgroud)

我添加了几条额外的输出线:

Average Number Before Duplicate: 41043.954
Minimum Number Before Duplicate: 2498
Maximum Number Before Duplicate: 127683
 Median Number Before Duplicate: 37860
Run Code Online (Sandbox Code Playgroud)

并不是很有趣,虽然平均值约为40k,但请注意最小和最大值,相隔两个数量级。

随机性不能保证均匀分布。在连续两次掷骰子中,两次掷出4仍然是随机的。以前一生中两次或多次赢得彩票大奖。

如果您需要每个线程更独特的发行版,请参考Jon Skeet 最出色的书中的RandomProvider示例(是的,我是一个狂热的男孩)。

更新

一个小的并行执行重写,因为折磨基于硅的生命形式很有趣:

    static void Main(string[] args)
    {

        ConcurrentDictionary<int, int> resultDictionary = new ConcurrentDictionary<int, int>();
        Parallel.For(0, 1000, x =>
        {
            var r = RandomProvider.GetThreadRandom();
            ConcurrentDictionary<string, int> dictionary = new ConcurrentDictionary<string, int>();
                while (true)
                {
                    int rand = r.Next(0, 1073741823);
                    CrockfordBase32Encoding encoding = new CrockfordBase32Encoding();
                    string encodedRand = encoding.Encode((ulong) rand, false);
                    if (!dictionary.TryAdd(encodedRand, rand)) break;
                }
            Console.WriteLine("{0} - {1}", x, dictionary.Count);
            resultDictionary.TryAdd(x, dictionary.Count);
        });

        Console.WriteLine();
        Console.WriteLine("Average Number Before Duplicate: " + resultDictionary.Average(x => x.Value));
        Console.WriteLine("Minimum Number Before Duplicate: " + resultDictionary.Min(x => x.Value));
        Console.WriteLine("Maximum Number Before Duplicate: " + resultDictionary.Max(x => x.Value));
        Console.WriteLine(" Median Number Before Duplicate: " + resultDictionary.Select(x=>x.Value).Median());
        Console.ReadLine();
    }
Run Code Online (Sandbox Code Playgroud)

结果:

Average Number Before Duplicate: 41826.375
Minimum Number Before Duplicate: 1655
Maximum Number Before Duplicate: 134671
 Median Number Before Duplicate: 39119
Run Code Online (Sandbox Code Playgroud)

更新2

因此,CodeProject文章的作者将其工作发布为NuGet包:

Install-Package Troschuetz.Random
Run Code Online (Sandbox Code Playgroud)

我使用了相同的示例代码来测试不同的生成器:

标准生成器

Average Number Before Duplicate: 40434.148
Minimum Number Before Duplicate: 978
Maximum Number Before Duplicate: 136248
 Median Number Before Duplicate: 38845
Run Code Online (Sandbox Code Playgroud)

ALF发电机

Average Number Before Duplicate: 40395.845
Minimum Number Before Duplicate: 828
Maximum Number Before Duplicate: 125705
 Median Number Before Duplicate: 38042
Run Code Online (Sandbox Code Playgroud)

MT19937发电机

Average Number Before Duplicate: 40478.174
Minimum Number Before Duplicate: 2723
Maximum Number Before Duplicate: 121367
 Median Number Before Duplicate: 38279
Run Code Online (Sandbox Code Playgroud)

XorShift128Generator

Average Number Before Duplicate: 41463.732
Minimum Number Before Duplicate: 878
Maximum Number Before Duplicate: 111206
 Median Number Before Duplicate: 39013.5
Run Code Online (Sandbox Code Playgroud)

所以你有它。享受你值得拥有的东西..

  • 实际上,存在一个[内置随机类中的错误](https://connect.microsoft.com/VisualStudio/feedback/details/634761/system-random-serious-bug),该错误的周期较短超出了基于所使用算法的最初预期。尽管如此,这不是OP问题的原因,而是有趣的信息。 (3认同)