How do I convert Turkish characters in a string to English characters?

ozg*_*gun 17 c# encoding

string strTurkish ="ÜST";

How can I turn the value of strTurkish into "UST"?

L.B*_*L.B 16

using System.Globalization;
using System.Linq;
using System.Text;

var text = "ÜST";
var unaccentedText  = String.Join("", text.Normalize(NormalizationForm.FormD)
        .Where(c => char.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark));

  • This does not normalize `ı`. Is there another way? (3 upvotes)
  • `var text = "ÜST"; var unaccentedText = String.Join("", text.Normalize(NormalizationForm.FormD) .Where(c => char.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)).Replace("ı", "i");` //swh (3 upvotes)
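
Following up on the comments above: `ı` (U+0131) has no canonical decomposition, so `FormD` leaves it untouched, whereas `İ`, `ğ`, `ş`, `ç`, `ö` and `ü` all split into a base letter plus a combining mark. A minimal sketch that handles both cases might look like this. The `TurkishText` class and `ToAscii` method names are made up for illustration and do not come from any answer here.

```csharp
using System.Globalization;
using System.Linq;
using System.Text;

public static class TurkishText
{
    // Hypothetical helper: strips diacritics and maps dotless i by hand.
    public static string ToAscii(string text)
    {
        // U+0131 (ı) does not decompose, so replace it explicitly first.
        string mapped = text.Replace('ı', 'i');

        // FormD splits e.g. Ü into U + U+0308 and İ into I + U+0307.
        string decomposed = mapped.Normalize(NormalizationForm.FormD);

        // Keep the base letters and drop the combining (non-spacing) marks.
        var kept = decomposed.Where(c =>
            CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark);

        return new string(kept.ToArray());
    }
}
```

With this sketch, `TurkishText.ToAscii("ÜST")` should give `"UST"`. Characters outside the Turkish alphabet that also lack a decomposition (such as `ß` or `ø`) would still pass through unchanged.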

ogu*_*gun 13

You can solve the problem with the following method. The other approaches do not correctly convert the Turkish dotless lowercase i (\u0131).

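// Requires: using System.Globalization; using System.Text;
// On .NET Core / .NET 5+, code page 1252 is only available after calling
// Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
public static string RemoveDiacritics(string text)
{
    Encoding srcEncoding = Encoding.UTF8;
    Encoding destEncoding = Encoding.GetEncoding(1252); // Windows-1252 (Western European)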

    text = destEncoding.GetString(Encoding.Convert(srcEncoding, destEncoding, srcEncoding.GetBytes(text)));

    string normalizedString = text.Normalize(NormalizationForm.FormD);
    StringBuilder result = new StringBuilder();

    for (int i = 0; i < normalizedString.Length; i++)
    {
        if (!CharUnicodeInfo.GetUnicodeCategory(normalizedString[i]).Equals(UnicodeCategory.NonSpacingMark))
        {
            result.Append(normalizedString[i]);
        }
    }

    return result.ToString();
}


Umu*_* D. 10

public string TurkishCharacterToEnglish(string text)
{
    char[] turkishChars = {'ı', 'ğ', 'İ', 'Ğ', 'ç', 'Ç', 'ş', 'Ş', 'ö', 'Ö', 'ü', 'Ü'};
    char[] englishChars = {'i', 'g', 'I', 'G', 'c', 'C', 's', 'S', 'o', 'O', 'u', 'U'};

    // Match chars
    for (int i = 0; i < turkishChars.Length; i++)
        text = text.Replace(turkishChars[i], englishChars[i]);

    return text;
}


Jon*_*eet 7

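I'm not an expert on this, but I think you can achieve it with string.Normalize, by decomposing the value and then effectively removing the non-ASCII characters: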

using System;
using System.Linq;
using System.Text;

class Test
{
    static void Main()
    {
        string text = "\u00DCST";
        string normalized = text.Normalize(NormalizationForm.FormD);
        string asciiOnly = new string(normalized.Where(c => c < 128).ToArray());
        Console.WriteLine(asciiOnly);
    }    
}

It is entirely possible that this does horrible things in some cases.
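
One such case, as a rough illustration: any non-ASCII character that does not decompose is deleted outright instead of being mapped to a lookalike. The sketch below wraps the same technique in a helper so the effect is easy to try. The `AsciiFilterDemo` and `KeepAsciiOnly` names are hypothetical.

```csharp
using System.Linq;
using System.Text;

public static class AsciiFilterDemo
{
    // Hypothetical wrapper around the decompose-then-filter approach above.
    public static string KeepAsciiOnly(string text)
    {
        // Split accented letters into base letter + combining mark.
        string normalized = text.Normalize(NormalizationForm.FormD);

        // Discard everything outside the 7-bit ASCII range.
        return new string(normalized.Where(c => c < 128).ToArray());
    }
}
```

`AsciiFilterDemo.KeepAsciiOnly("ÜST")` should return `"UST"`, but `AsciiFilterDemo.KeepAsciiOnly("ılık")` should return `"lk"`, because `ı` (U+0131) has no decomposition and is simply dropped.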