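Edit: OK, since I have clearly taken the wrong approach, I will explain what I intend to do. The overall intent is (as an exercise) to validate all valid email addresses against the specification. This part is for generating part of the data set used to verify the algorithm.
As an exercise, I am writing a program that will generate all possible email addresses. This would result in 80 81 65 ≈ 1.4e122 possible items. I am currently using List<T>s to store the generated items, but my understanding is that a List<T> has a maximum capacity of Int32.MaxValue. I am guessing a proper solution would not involve a List of Lists of Lists. This is what I have so far.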
    // Requires: using System; using System.Collections.Generic;
    private void GenerateLocalPart()
    {
        // Character classes considered for the local part.
        List<string> validLocalSymbols = new List<string>()
        {
            ".", "!", "#", "$", "%", "&", "*", "+", "-",
            "/", "^", "_", "`", "{", "|", "}", "~", "\"",
        };
        List<string> validLocalNumbers = new List<string>()
        {
            "0", "1", "2", "3", "4", "5", "6", "7", "8", "9",
        };
        List<string> validLocalLowercase = new List<string>()
        {
            "a", "b", "c", "d", "e", "f", "g", "h", "i", "j",
            "k", "l", "m", "n", "o", "p", "q", "r", "s", "t",
            "u", "v", "w", "x", "y", "z",
        };
        List<string> validLocalUppercase = new List<string>()
        {
            "A", "B", "C", "D", "E", "F", "G", "H", "I", "J",
            "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T",
            "U", "V", "W", "X", "Y", "Z",
        };

        // Full alphabet (built here, but not used below yet).
        List<string> validLocalPartCharacters = new List<string>();
        validLocalPartCharacters.AddRange(validLocalSymbols);
        validLocalPartCharacters.AddRange(validLocalNumbers);
        validLocalPartCharacters.AddRange(validLocalLowercase);
        validLocalPartCharacters.AddRange(validLocalUppercase);

        // For now, only lowercase letters up to length 5 are generated.
        List<string> targetSequence = validLocalLowercase;
        int lengthOfStringToGenerate = 5;
        int numberOfDifferentSourceCharacters = targetSequence.Count;

        // localPart[i] holds every generated string of length i + 1.
        List<List<string>> localPart = new List<List<string>>();
        List<string> localPartSeed = new List<string>();
        localPart.Add(localPartSeed);
        foreach (string character in targetSequence)
            localPartSeed.Add(character);

        // Extend every string of the previous length by each character.
        for (int i = 1; i < lengthOfStringToGenerate; i++)
        {
            List<string> bufferList = new List<string>();
            localPart.Add(bufferList);
            foreach (string lastListString in localPart[i - 1])
                foreach (string character in targetSequence)
                    bufferList.Add(lastListString + character);
        }
        Console.WriteLine("Break here.");
    }
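lengthOfStringToGenerate is the maximum length of the string (so the code generates all combinations of length 1 up to lengthOfStringToGenerate). localPart will end up holding a number of Lists equal to lengthOfStringToGenerate. Should I be using a different type of collection? Should I be taking a different overall approach?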
Jon*_*eet 15
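Where are you expecting to store all this data? A List<T> will always store its values in memory... but even if you wrote something to store the results to disk, you still would not be able to hold 1.4e122 items. Have you really taken in how big that number is? Even if each item were only one bit, that is more than the capacity of the universe, if the whole universe were one big hard disk.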
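The largest unit of data I have heard talked about in any meaningful way is the exabyte, which is 10^18 bytes. For most people, a petabyte (10^15 bytes) is a very large amount of data. What you are considering makes these quantities look microscopically small.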
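To put rough numbers on that comparison, here is a minimal sketch of the arithmetic. It takes the 1.4e122 figure from the question at face value and assumes the most optimistic encoding of one bit per item. The class and method names (`ScaleCheck`, `BytesAtOneBitPerItem`) are made up for illustration.

```csharp
using System;

public static class ScaleCheck
{
    // Bytes needed to hold `items` entries at one bit each (hypothetical helper).
    public static double BytesAtOneBitPerItem(double items) => items / 8.0;

    public static void Main()
    {
        const double items = 1.4e122;   // item count quoted in the question
        const double exabyte = 1e18;    // bytes in one exabyte
        const double petabyte = 1e15;   // bytes in one petabyte

        double bytes = BytesAtOneBitPerItem(items);

        // Both ratios are so large that double precision is more than enough.
        Console.WriteLine($"Exabytes needed:  {bytes / exabyte:E2}");
        Console.WriteLine($"Petabytes needed: {bytes / petabyte:E2}");
    }
}
```

Even at one bit per item this works out to roughly 1.75e103 exabytes, which is why swapping List<T> for a different collection type does not change the picture.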
What do you want to do with the data afterwards? And when would you expect such an algorithm to actually finish?
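Both questions can be explored with a small sketch. If each item only needs to be looked at once, an iterator can produce the strings lazily in the same order as the code in the question, so nothing has to be stored; that removes the memory problem but not the time problem. The names `LazyLocalParts`, `Generate` and `Count` are hypothetical, and the rate of one billion items per second is purely an assumption for the estimate.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class LazyLocalParts
{
    // Lazily yields every string of length 1..maxLength over the alphabet,
    // shortest first; only the current string is held in memory.
    public static IEnumerable<string> Generate(string alphabet, int maxLength)
    {
        for (int length = 1; length <= maxLength; length++)
        {
            var indices = new int[length];   // odometer; all zeros is the first string
            var buffer = new char[length];
            while (true)
            {
                for (int i = 0; i < length; i++)
                    buffer[i] = alphabet[indices[i]];
                yield return new string(buffer);

                // Advance the odometer from the rightmost position, carrying left.
                int pos = length - 1;
                while (pos >= 0 && ++indices[pos] == alphabet.Length)
                {
                    indices[pos] = 0;
                    pos--;
                }
                if (pos < 0) break;          // carried past the left end: length exhausted
            }
        }
    }

    // Number of strings Generate produces: n + n^2 + ... + n^maxLength.
    public static BigInteger Count(int alphabetSize, int maxLength)
    {
        BigInteger total = 0, power = 1;
        for (int i = 0; i < maxLength; i++)
        {
            power *= alphabetSize;
            total += power;
        }
        return total;
    }

    public static void Main()
    {
        foreach (string s in Generate("ab", 2))
            Console.WriteLine(s);            // a, b, aa, ab, ba, bb

        // The question's test case: lowercase letters, up to length 5.
        Console.WriteLine(Count(26, 5));     // 12356630

        // Time to merely visit 1.4e122 items at an ASSUMED 1e9 items per second.
        const double itemsPerSecond = 1e9;
        double years = 1.4e122 / itemsPerSecond / (365.25 * 24 * 3600);
        Console.WriteLine($"{years:E1} years");
    }
}
```

For 26 letters and a maximum length of 5 the count is a manageable 12,356,630, but for the full 1.4e122 items the estimate comes out on the order of 10^105 years, so streaming makes small test sets practical without making the complete enumeration feasible.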