I need to do the following:
static string[] pats = { "å", "Å", "æ", "Æ", "ä", "Ä", "ö", "Ö", "ø", "Ø" ,"è", "È", "à", "À", "ì", "Ì", "õ", "Õ", "ï", "Ï" };
static string[] repl = { "a", "A", "a", "A", "a", "A", "o", "O", "o", "O", "e", "E", "a", "A", "i", "I", "o", "O", "i", "I" };
static int i = pats.Length;
int j;
// function for the replacement(s)
public string DoRepl(string Inp) {
    string tmp = Inp;
    for( j = 0; j < i; j++ ) {
        tmp = Regex.Replace(tmp,pats[j],repl[j]);
    }
    return tmp.ToString();
}
/* Main flow processes about 45000 lines of input */
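Each line has 6 elements that go through DoRepl, so about 300,000 function calls. Each call runs 20 Regex.Replace operations, for about 6 million replacements in total.
Is there a more elegant way to do this in fewer passes?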
Jes*_*det 21
static Dictionary<char, char> repl = new Dictionary<char, char>() { { 'å', 'a' }, { 'ø', 'o' } }; // etc...
public string DoRepl(string Inp)
{
    var tmp = Inp.Select(c =>
    {
        char r;
        if (repl.TryGetValue(c, out r))
            return r;
        return c;
    });
    return new string(tmp.ToArray());
}
This checks each character against the dictionary only once and replaces it if it is found there.
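For comparison, the same single-pass idea can be sketched without LINQ by appending to a StringBuilder. This is only an illustration: the class name AccentFolder, the method name Fold, and the shortened mapping are made up for the example and are not part of the answer above; the full table from the question would go in the dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class AccentFolder
{
    // Hypothetical partial mapping for illustration only;
    // extend it with every pair from the question's pats/repl arrays.
    static readonly Dictionary<char, char> Map = new Dictionary<char, char>
    {
        { 'å', 'a' }, { 'Å', 'A' },
        { 'ø', 'o' }, { 'Ø', 'O' },
        { 'è', 'e' }, { 'È', 'E' }
    };

    // Walks the input once; each character costs one dictionary lookup.
    public static string Fold(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            char r;
            // Append the replacement if one is mapped, else the original character.
            sb.Append(Map.TryGetValue(c, out r) ? r : c);
        }
        return sb.ToString();
    }
}
```

With this mapping, AccentFolder.Fold("blå ø") returns "bla o", and characters that are not in the dictionary pass through unchanged.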
Jon*_*röm 12
How about this "trick"?
string conv = Encoding.ASCII.GetString(Encoding.GetEncoding("Cyrillic").GetBytes(input));
Ste*_*ger 10
It might be faster without regular expressions.
for( j = 0; j < i; j++ )
{
    tmp = tmp.Replace(pats[j], repl[j]);
}
Edit
Another way, using Zip and StringBuilder:
StringBuilder result = new StringBuilder(input);
foreach (var zipped in patterns.Zip(replacements, (p, r) => new {p, r}))
{
    result = result.Replace(zipped.p, zipped.r);
}
return result.ToString();