L.B*_*L.B 16
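// Requires: using System.Globalization; using System.Linq; using System.Text;
var text = "ÜST";
// FormD splits "Ü" into "U" plus a combining diaeresis; dropping the marks leaves "UST".
var unaccentedText = String.Join("", text.Normalize(NormalizationForm.FormD)
    .Where(c => char.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark));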
ogu*_*gun 13
You can solve the problem with the following method. The other methods do not convert the Turkish lowercase dotless ı (\u0131) correctly.
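// Requires: using System.Globalization; using System.Text;
public static string RemoveDiacritics(string text)
{
    Encoding srcEncoding = Encoding.UTF8;
    // Code page 1252 (Latin alphabet). On .NET Core / .NET 5+ this call needs
    // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) first.
    Encoding destEncoding = Encoding.GetEncoding(1252);
    // Round-trip through 1252 so characters it lacks are replaced by the encoding's fallback.
    text = destEncoding.GetString(Encoding.Convert(srcEncoding, destEncoding, srcEncoding.GetBytes(text)));
    // Decompose what remains and drop the combining marks.
    string normalizedString = text.Normalize(NormalizationForm.FormD);
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < normalizedString.Length; i++)
    {
        if (CharUnicodeInfo.GetUnicodeCategory(normalizedString[i]) != UnicodeCategory.NonSpacingMark)
        {
            result.Append(normalizedString[i]);
        }
    }
    return result.ToString();
}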
Umu*_* D. 10
public string TurkishCharacterToEnglish(string text)
{
    char[] turkishChars = {'ı', 'ğ', 'İ', 'Ğ', 'ç', 'Ç', 'ş', 'Ş', 'ö', 'Ö', 'ü', 'Ü'};
    char[] englishChars = {'i', 'g', 'I', 'G', 'c', 'C', 's', 'S', 'o', 'O', 'u', 'U'};

    // Match chars
    for (int i = 0; i < turkishChars.Length; i++)
        text = text.Replace(turkishChars[i], englishChars[i]);

    return text;
}
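I'm no expert on this sort of thing, but I think you can do it with string.Normalize, by decomposing the value and then effectively removing the non-ASCII characters: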
using System;
using System.Linq;
using System.Text;

class Test
{
    static void Main()
    {
        string text = "\u00DCST";
        string normalized = text.Normalize(NormalizationForm.FormD);
        string asciiOnly = new string(normalized.Where(c => c < 128).ToArray());
        Console.WriteLine(asciiOnly);
    }
}
It's entirely possible that this does horrible things in some cases.
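One such case is the Turkish dotless ı (U+0131): it has no canonical decomposition, so FormD leaves it as a single non-ASCII character and the `c < 128` filter drops it rather than turning it into `i`. Below is a minimal sketch of one possible workaround, assuming it is acceptable to map ı to `i` explicitly before stripping the combining marks. The names `DiacriticsDemo`, `AsciiOnly` and `StripMarksKeepDotlessI` are hypothetical and only used for illustration.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

static class DiacriticsDemo
{
    // The ASCII filter described above: decompose, then keep only chars below 128.
    public static string AsciiOnly(string text) =>
        new string(text.Normalize(NormalizationForm.FormD)
                       .Where(c => c < 128)
                       .ToArray());

    // Variant (an assumption, not taken from the answers): replace the dotless ı
    // explicitly, then decompose and drop only the non-spacing marks.
    public static string StripMarksKeepDotlessI(string text)
    {
        string decomposed = text.Replace('\u0131', 'i')
                                .Normalize(NormalizationForm.FormD);
        var result = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                result.Append(c);
            }
        }
        return result.ToString();
    }
}
```

With the input "\u0131l\u0131k" (ılık), `AsciiOnly` should return "lk", since both ı characters are dropped, while `StripMarksKeepDotlessI` should return "ilik". Unlike the ASCII filter, the variant keeps any other non-ASCII character that has no decomposition, which may or may not be what you want.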