用于删除字符串中重复字符的函数

Question

用于删除字符串中重复字符的函数

以下代码尝试删除字符串中的任何重复字符.我不确定代码是否正确.任何人都可以帮我处理代码(即当字符匹配时实际发生了什么)？

public static void removeDuplicates(char[] str) {
  if (str == null) return;
  int len = str.length;
  if (len < 2) return;
  int tail = 1;
  for (int i = 1; i < len; ++i) {
    int j;
    for (j = 0; j < tail; ++j) {
      if (str[i] == str[j]) break;
    }
    if (j == tail) {
      str[tail] = str[i];
      ++tail;
    }
  }
  str[tail] = 0;
}

Run Code Online (Sandbox Code Playgroud)

Answer 1

cod*_*ict 43

这个功能对我来说很好看.我写过内联评论.希望能帮助到你:

// function takes a char array as input.
// modifies it to remove duplicates and adds a 0 to mark the end
// of the unique chars in the array.
public static void removeDuplicates(char[] str) {
  if (str == null) return; // if the array does not exist..nothing to do return.
  int len = str.length; // get the array length.
  if (len < 2) return; // if its less than 2..can't have duplicates..return.
  int tail = 1; // number of unique char in the array.
  // start at 2nd char and go till the end of the array.
  for (int i = 1; i < len; ++i) { 
    int j;
    // for every char in outer loop check if that char is already seen.
    // char in [0,tail) are all unique.
    for (j = 0; j < tail; ++j) {
      if (str[i] == str[j]) break; // break if we find duplicate.
    }
    // if j reachs tail..we did not break, which implies this char at pos i
    // is not a duplicate. So we need to add it our "unique char list"
    // we add it to the end, that is at pos tail.
    if (j == tail) {
      str[tail] = str[i]; // add
      ++tail; // increment tail...[0,tail) is still "unique char list"
    }
  }
  str[tail] = 0; // add a 0 at the end to mark the end of the unique char.
}

Run Code Online (Sandbox Code Playgroud)

代码不好; 如果没有任何欺骗,最后一行会导致`ArrayIndexOutOfBoundsException`. (18认同)
太晚了,但最后一个括号前的最后一行:**[if(tail <len)str [tail] = 0;]** (2认同)

Answer 2

pol*_*nts 33

你的代码是,我很遗憾地说,非常像C.

Java String不是char[].你说你想从a中删除重复项String,但是你需要一个char[].

这是char[] \0终止的吗？看起来不像是因为你占用了整个.length数组.但是,您的算法会尝试\0终止数组的一部分.如果数组不包含重复项,会发生什么？

好吧,正如它所写,你的代码实际上抛出ArrayIndexOutOfBoundsException了最后一行!没有空间,\0因为所有的插槽都用完了!

您可以\0在此例外情况下添加一个不添加的检查,但那么您打算如何使用此代码呢？你打算有一个类似strlen函数来找到\0数组中的第一个吗？如果没有,会发生什么？(由于上面所有独特的例外情况？).

如果原件String/ char[]包含\0？会发生什么？(顺便说一句,这在Java中是完全合法的,请参阅JLS 10.9字符数组不是字符串)

结果将是一团糟,所有这些都是因为你想要做所有类似C的事情,并且没有任何额外的缓冲区.你确定你真的需要这样做吗？为什么不一起工作String,indexOf,lastIndexOf,replace,和所有的更高级别的API String？它可证明是太慢了,还是只是怀疑它是？

"过早优化是所有邪恶的根源".我很抱歉,如果你甚至不能理解原始代码的作用,那么弄清楚它将如何适应更大(更混乱)的系统将是一场噩梦.

我的最小建议是做以下事情:

使函数接受并返回a String,即public static String removeDuplicates(String in)
在内部,与...一起工作 char[] str = in.toCharArray();
替换最后一行 return new String(str, 0, tail);

这确实使用了额外的缓冲区,但至少与系统其余部分的接口更加清晰.

或者,您可以这样使用StringBuilder:

static String removeDuplicates(String s) {
    StringBuilder noDupes = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
        String si = s.substring(i, i + 1);
        if (noDupes.indexOf(si) == -1) {
            noDupes.append(si);
        }
    }
    return noDupes.toString();
}

Run Code Online (Sandbox Code Playgroud)

请注意,这与您拥有的算法基本相同,但更清晰,没有太多小角落情况等.

使用StringBuilder的解决方案肯定更好,但不在问题的范围内.设计算法并编写代码以删除字符串中的重复字符,而无需使用任何其他缓冲区.注意:一个或两个额外的变量是好的.数组的额外副本不是. (11认同)
@polygene为什么在使用charAt()时使用substring()？ (4认同)

Answer 3

Dhr*_*ola 16

鉴于以下问题:

编写代码以删除字符串中的重复字符,而不使用任何其他缓冲区.注意:一个或两个额外的变量是好的.数组的额外副本不是.

由于一个或两个附加变量很好但不允许缓冲区,因此可以通过使用整数来存储位来模拟散列映射的行为.这个简单的解决方案运行在O(n),这比你的更快.此外,它在概念上并不复杂和就地:

    public static void removeDuplicates(char[] str) {
        int map = 0;
        for (int i = 0; i < str.length; i++) {
            if ((map & (1 << (str[i] - 'a'))) > 0) // duplicate detected
                str[i] = 0;
            else // add unique char as a bit '1' to the map
                map |= 1 << (str[i] - 'a');
        }
    }

Run Code Online (Sandbox Code Playgroud)

缺点是重复项(用0替换)不会放在str []数组的末尾.但是,这可以通过最后一次循环遍历数组来轻松解决.此外,整数只能容纳常规字母.

我真的很喜欢这个解决方案,非常喜欢.从概念上讲,这是一个很好的解决方案,它不适用于正常情况.它只能在一个区分大小写的情况下工作.没有空格支持.这将在256位系统中处理奇迹,以处理整个ASCII范围.但它确实在O(N)中运行.我仍然为这一个+1. (2认同)

Answer 4

小智 10

private static String removeDuplicateCharactersFromWord(String word) {

    String result = new String("");

    for (int i = 0; i < word.length(); i++) {
        if (!result.contains("" + word.charAt(i))) {
            result += "" + word.charAt(i);
        }
    }

    return result;
}

Run Code Online (Sandbox Code Playgroud)

Answer 5

Mar*_*sic 7

这是我对O(n ^ 2)复杂度问题的解决方案.这个想法与书中的想法大致相同,但我相信我的解决方案更易读,更容易理解算法本身:

public static void removeDuplicates(char[] str) {
        // if string has less than 2 characters, it can't contain
        // duplicate values, so there's nothing to do
        if (str == null || str.length < 2) {
            return;
        }

        // variable which indicates the end of the part of the string 
        // which is 'cleaned' (all duplicates removed)
        int tail = 0;

        for (int i = 0; i < str.length; i++) {
            boolean found = false;

            // check if character is already present in
            // the part of the array before the current char
            for (int j = 0; j < i; j++) {
                if (str[j] == str[i]) {
                    found = true;
                    break;
                }
            }

            // if char is already present
            // skip this one and do not copy it
            if (found) {
                continue;
            }

            // copy the current char to the index 
            // after the last known unique char in the array
            str[tail] = str[i];
            tail++;
        }

        str[tail] = '\0';
    }

Run Code Online (Sandbox Code Playgroud)

归档时间：	15 年，9 月前
查看次数：	103739 次
最近记录：	6 年，6 月前