Remove characters from C# string

wan*_*wan 138 .net c#

How do I remove characters from a string? For example: "My name @is ,Wan.;'; Wan".

I would like to remove the characters '@', ',', '.', ';', '\'' from that string so that it becomes "My name is Wan Wan".

Alb*_*nbo 168

var str = "My name @is ,Wan.;'; Wan";
var charsToRemove = new string[] { "@", ",", ".", ";", "'" };
foreach (var c in charsToRemove)
{
    str = str.Replace(c, string.Empty);
}

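But if you want to remove every character that is not a letter or digit (keeping whitespace), I would suggest a different approach: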

var str = "My name @is ,Wan.;'; Wan";
str = new string((from c in str
                  where char.IsWhiteSpace(c) || char.IsLetterOrDigit(c)
                  select c
       ).ToArray());

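  • You can also do it like this: str = new string(str.Where(x => char.IsWhiteSpace(x) || char.IsLetterOrDigit(x)).ToArray()); (10 upvotes)
  • Am I the only one who gets "Argument 2: cannot convert from 'string' to 'char'" for string.Empty? (5 upvotes)
  • @OddDev You should only get this error if the array you are looping through is a list of chars. If they are strings, this should work. (2 upvotes)
  • Also note that for "str.Replace" to work with string.Empty as the second argument, the first argument must be a "string". If you use a char (e.g. 'a') as the first argument, the second argument must also be a char. Otherwise you get the "Argument 2: cannot convert from 'string' to 'char'" error that @OddDev mentioned above. (2 upvotes)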
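
The overload mismatch described in the comments above can be seen in a few lines. This is a minimal sketch written as a C# top-level program; the variable names are made up for illustration and are not from the answer.

```csharp
using System;

// String.Replace has a (char, char) overload and a (string, string) overload;
// the two argument types cannot be mixed.
string input = "My name @is ,Wan.;'; Wan";

// (string, string): the second argument may be string.Empty, which deletes each match.
string withoutAt = input.Replace("@", string.Empty);

// (char, char): swaps one character for another. There is no "empty" char,
// so this overload can substitute but not delete.
string commasToSpaces = input.Replace(',', ' ');

// input.Replace('@', string.Empty); // does not compile: cannot convert from 'string' to 'char'

Console.WriteLine(withoutAt);
Console.WriteLine(commasToSpaces);
```

This is why the loop in the answer iterates over an array of strings rather than an array of chars.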

Eni*_*ity 60

Simple:

String.Join("", "My name @is ,Wan.;'; Wan".Split('@', ',' ,'.' ,';', '\''));

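  • The readability is not great, but it does seem to be the best-performing solution here. See the [comment](/sf/answers/3401345531/) (3 upvotes)
  • Or replace the `Join` on an empty string with `Concat`: `string.Concat("My name @is ,Wan.;'; Wan".Split('@', ',' ,'.' ,';' , '\''));` (2 upvotes)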

Joh*_*lle 59

Sounds like an ideal application for RegEx, an engine designed for fast text manipulation. In this case:

Regex.Replace("He\"ll,o Wo'r.ld", "[@,\\.\";'\\\\]", string.Empty)

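  • This is not faster than a loop; it is a common misconception that regexes are always faster than loops. Regexes are not magic: at their core they still have to iterate over the string at some point, and the overhead of the regex engine itself can make them much slower. They really shine for extremely complex operations that would otherwise need dozens of lines of code and multiple loops. Testing the compiled version of this regex against a simple, unoptimized loop 50,000 times, the regex is 6 times slower. (10 upvotes)
  • This looks like it would be more efficient than an iterator-based approach, especially if you can use a compiled Regex. (3 upvotes)
  • Perhaps I misspoke when I asserted that RegEx is fast. Unless this sits at the center of a very tight loop, other considerations such as readability and maintainability will probably outweigh performance for a small operation like this. (2 upvotes)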
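
For readers wondering what "a compiled Regex" means in the comments above, here is a minimal sketch written as a C# top-level program. The pattern is an assumption based on the five characters named in the question, and the names `unwanted` and `Clean` are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Build the pattern once and reuse the instance. RegexOptions.Compiled spends
// more time at construction in exchange for faster matching afterwards.
// Assumed pattern: the five characters from the question (@ , . ; ').
Regex unwanted = new Regex("[@,.;']", RegexOptions.Compiled);

// Local helper that removes every match of the shared pattern.
string Clean(string s) => unwanted.Replace(s, string.Empty);

Console.WriteLine(Clean("My name @is ,Wan.;'; Wan"));
```

Whether this pays off depends on how often the pattern runs; as the first comment notes, a plain loop can still be faster for a task this small.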

Thi*_*ark 19

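Less specific to your question: you can remove all punctuation from a string (except spaces) by whitelisting the acceptable characters in a regular expression: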

string dirty = "My name @is ,Wan.;'; Wan";

// only space, capital A-Z, lowercase a-z, and digits 0-9 are allowed in the string
string clean = Regex.Replace(dirty, "[^A-Za-z0-9 ]", "");

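Note that there is a space after the 9 so that spaces are not removed from your sentence. The third argument is an empty string, which replaces every character that is not in the whitelist.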
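
One caveat with an A-Za-z whitelist is that it also strips letters outside ASCII. The sketch below, written as a C# top-level program, compares it with a variant that uses Unicode categories; that variant is an assumption of mine rather than part of the answer, and the sample string is made up.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input containing a non-ASCII letter (U+00EB, e with diaeresis).
string dirty = "Zo\u00eb, 42!";

// ASCII-only whitelist from the answer above: the accented letter is removed too.
string asciiOnly = Regex.Replace(dirty, "[^A-Za-z0-9 ]", "");

// Assumed variant: \p{L} matches any Unicode letter, \p{N} any Unicode number.
string unicodeAware = Regex.Replace(dirty, @"[^\p{L}\p{N} ]", "");

Console.WriteLine(asciiOnly);
Console.WriteLine(unicodeAware);
```

Which whitelist is appropriate depends on whether the input is expected to contain non-English text.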


小智 18

 string x = "My name @is ,Wan.;'; Wan";
 string modifiedString = x.Replace("@", "").Replace(",", "").Replace(".", "").Replace(";", "").Replace("'", "");


drz*_*aus 14

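Comparing the various suggestions (and also comparing them for single-character replacement with targets of various sizes and positions).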

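In this particular case, splitting on the targets and joining on the replacement (here, the empty string) is the fastest, by a factor of roughly 3 (2.77x faster than the runner-up, Replace). Ultimately, performance varies with the number of replacements, where they occur in the source, and the size of the source. #ymmv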

Results

(full results here)

| Test                      | Compare | Elapsed                                                            |
|---------------------------|---------|--------------------------------------------------------------------|
| SplitJoin                 | 1.00x   | 29023 ticks elapsed (2.9023 ms) [in 10K reps, 0.00029023 ms per]   |
| Replace                   | 2.77x   | 80295 ticks elapsed (8.0295 ms) [in 10K reps, 0.00080295 ms per]   |
| RegexCompiled             | 5.27x   | 152869 ticks elapsed (15.2869 ms) [in 10K reps, 0.00152869 ms per] |
| LinqSplit                 | 5.43x   | 157580 ticks elapsed (15.758 ms) [in 10K reps, 0.0015758 ms per]   |
| Regex, Uncompiled         | 5.85x   | 169667 ticks elapsed (16.9667 ms) [in 10K reps, 0.00169667 ms per] |
| Regex                     | 6.81x   | 197551 ticks elapsed (19.7551 ms) [in 10K reps, 0.00197551 ms per] |
| RegexCompiled Insensitive | 7.33x   | 212789 ticks elapsed (21.2789 ms) [in 10K reps, 0.00212789 ms per] |
| Regex Insentive           | 7.52x   | 218164 ticks elapsed (21.8164 ms) [in 10K reps, 0.00218164 ms per] |

Test harness (LinqPad)

(Note: Perf and Vs are timing extensions I wrote)

void test(string title, string sample, string target, string replacement) {
    var targets = target.ToCharArray();

    var tox = "[" + target + "]";
    var x = new Regex(tox);
    var xc = new Regex(tox, RegexOptions.Compiled);
    var xci = new Regex(tox, RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // no, don't dump the results
    var p = new Perf/*<string>*/();
        p.Add(string.Join(" ", title, "Replace"), n => targets.Aggregate(sample, (res, curr) => res.Replace(new string(curr, 1), replacement)));
        p.Add(string.Join(" ", title, "SplitJoin"), n => String.Join(replacement, sample.Split(targets)));
        p.Add(string.Join(" ", title, "LinqSplit"), n => String.Concat(sample.Select(c => targets.Contains(c) ? replacement : new string(c, 1))));
        p.Add(string.Join(" ", title, "Regex"), n => Regex.Replace(sample, tox, replacement));
        p.Add(string.Join(" ", title, "Regex Insentive"), n => Regex.Replace(sample, tox, replacement, RegexOptions.IgnoreCase));
        p.Add(string.Join(" ", title, "Regex, Uncompiled"), n => x.Replace(sample, replacement));
        p.Add(string.Join(" ", title, "RegexCompiled"), n => xc.Replace(sample, replacement));
        p.Add(string.Join(" ", title, "RegexCompiled Insensitive"), n => xci.Replace(sample, replacement));

    var trunc = 40;
    var header = sample.Length > trunc ? sample.Substring(0, trunc) + "..." : sample;

    p.Vs(header);
}

void Main()
{
    // also see https://stackoverflow.com/questions/7411438/remove-characters-from-c-sharp-string

    "Control".Perf(n => { var s = "*"; });


    var text = "My name @is ,Wan.;'; Wan";
    var clean = new[] { '@', ',', '.', ';', '\'' };

    test("stackoverflow", text, string.Concat(clean), string.Empty);


    var target = "o";
    var f = "x";
    var replacement = "1";

    var fillers = new Dictionary<string, string> {
        { "short", new String(f[0], 10) },
        { "med", new String(f[0], 300) },
        { "long", new String(f[0], 1000) },
        { "huge", new String(f[0], 10000) }
    };

    var formats = new Dictionary<string, string> {
        { "start", "{0}{1}{1}" },
        { "middle", "{1}{0}{1}" },
        { "end", "{1}{1}{0}" }
    };

    foreach(var filler in fillers)
    foreach(var format in formats) {
        var title = string.Join("-", filler.Key, format.Key);
        var sample = string.Format(format.Value, target, filler.Value);

        test(title, sample, target, replacement);
    }
}

  • Finally, some numbers! Nice work @drzaus! (2 upvotes)

Fai*_* S. 7

The simplest way is to use String.Replace:

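// Replace is an instance method: call it on the source string (here `s`) and keep the returned copy.
string result = s.Replace("StringToReplace", "NewString");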


Lee*_*des 7

Going by @drzaus's performance numbers, here is an extension method that uses the fastest algorithm.

public static class StringEx
{
    public static string RemoveCharacters(this string s, params char[] unwantedCharacters) 
        => s == null ? null : string.Join(string.Empty, s.Split(unwantedCharacters));
}

Usage

var name = "edward woodward!";
var removeDs = name.RemoveCharacters('d', '!');
Assert.Equal("ewar woowar", removeDs); // old joke


Pau*_*ndy 6

Another simple solution:

var forbiddenChars = @"@,.;'".ToCharArray();
var dirty = "My name @is ,Wan.;'; Wan";
var clean = new string(dirty.Where(c => !forbiddenChars.Contains(c)).ToArray());


Mir*_*mvs 5

new List<string> { "@", ",", ".", ";", "'" }.ForEach(m => str = str.Replace(m, ""));


Mas*_*Net 5

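Here is a method I wrote that takes a slightly different approach. Rather than specifying the characters to remove, I tell my method which characters I want to keep, and it removes all the others.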

In the OP's example, he only wants to keep alphabetical characters and spaces. Here is what a call to my method would look like (C# demo):

var str = "My name @is ,Wan.;'; Wan";

// "My name is Wan Wan"
var result = RemoveExcept(str, alphas: true, spaces: true);

Here is my method:

/// <summary>
/// Returns a copy of the original string containing only the set of whitelisted characters.
/// </summary>
/// <param name="value">The string that will be copied and scrubbed.</param>
/// <param name="alphas">If true, all alphabetical characters (a-zA-Z) will be preserved; otherwise, they will be removed.</param>
/// <param name="numerics">If true, all numeric characters (0-9) will be preserved; otherwise, they will be removed.</param>
/// <param name="dashes">If true, all dash characters ("-") will be preserved; otherwise, they will be removed.</param>
/// <param name="underlines">If true, all underscore characters ("_") will be preserved; otherwise, they will be removed.</param>
/// <param name="spaces">If true, all space characters (" ") will be preserved; otherwise, they will be removed.</param>
/// <param name="periods">If true, all decimal characters (".") will be preserved; otherwise, they will be removed.</param>
public static string RemoveExcept(string value, bool alphas = false, bool numerics = false, bool dashes = false, bool underlines = false, bool spaces = false, bool periods = false) {
    if (string.IsNullOrWhiteSpace(value)) return value;
    if (new[] { alphas, numerics, dashes, underlines, spaces, periods }.All(x => x == false)) return value;

    var whitelistChars = new HashSet<char>(string.Concat(
        alphas ? "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" : "",
        numerics ? "0123456789" : "",
        dashes ? "-" : "",
        underlines ? "_" : "",
        periods ? "." : "",
        spaces ? " " : ""
    ).ToCharArray());

    var scrubbedValue = value.Aggregate(new StringBuilder(), (sb, @char) => {
        if (whitelistChars.Contains(@char)) sb.Append(@char);
        return sb;
    }).ToString();

    return scrubbedValue;
}