What is the best way to enumerate a string in Rust? (chars() vs as_bytes())

Question

I'm new to Rust and I'm using the Rust Book to learn it.

Recently, I came across this function there:

// Returns the number of characters in the first
// word of the given string

fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }

    s.len()
}

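As you can see, the author uses the String::as_bytes() method here to enumerate the string. They then write the char ' ' as a u8 (the byte literal b' ') to check whether we have reached the end of the first word.

As far as I know, there is another option that looks better: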

fn first_word(s: &String) -> usize {
    for (i, item) in s.chars().enumerate() {
        if item == ' ' {
            return i;
        }
    }
    s.len()
}

Here I'm using the String::chars() method, and the function looks cleaner.

So the question is: is there any difference between these two approaches? If so, which one is better, and why?

Answer 1

eff*_*ect 5

If your string happens to be pure ASCII (where every character is exactly one byte), the two functions should behave identically.

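However, Rust is designed to support UTF-8 strings, in which a single character can consist of multiple bytes, so s.chars() should be preferred; it lets your function keep working as expected if the string contains non-ASCII characters. Keep in mind that the index it returns then counts characters rather than bytes, while the s.len() fallback is still a byte length.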
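
To make the difference concrete, here is a minimal sketch comparing the two loops from the question on a non-ASCII string. The function names first_word_bytes, first_word_chars, and first_word_char_indices are made up for this illustration, and the third variant (built on str::char_indices) is an addition that is not part of the question. It is included only because it may be useful when a byte offset is needed for slicing.

```rust
// Byte-based loop from the question: the returned index is a byte offset.
fn first_word_bytes(s: &str) -> usize {
    for (i, &item) in s.as_bytes().iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    s.len()
}

// chars()-based loop from the question: the returned index counts chars
// (Unicode scalar values), not bytes.
fn first_word_chars(s: &str) -> usize {
    for (i, item) in s.chars().enumerate() {
        if item == ' ' {
            return i;
        }
    }
    s.len() // note: this fallback is still a byte length
}

// Hypothetical variant: char_indices() pairs each char with its byte
// offset, so the result can be used to slice the original string.
fn first_word_char_indices(s: &str) -> usize {
    s.char_indices()
        .find(|&(_, c)| c == ' ')
        .map(|(i, _)| i)
        .unwrap_or(s.len())
}
```

For "hello world" all three return 5. For "héllo wörld", where 'é' takes two bytes in UTF-8, the byte-based and char_indices versions return 6 while the chars() version returns 5, so only the first two are safe to use as a slice bound such as &s[..n].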

As @eggyal points out, Rust has a str::split_whitespace method, which returns an iterator over the words and splits on all whitespace (not just spaces). You can use it like this:

fn first_word(s: &String) -> usize {
    if let Some(word) = s.split_whitespace().next() {
        word.len()
    } else {
        s.len()
    }
}
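
As a rough sketch of how this behaves on a few edge cases, the version below restates the same logic with a match and adds a helper that returns the word itself. The names first_word_len and first_word are assumptions chosen for this example, not something from the answer above.

```rust
// Byte length of the first whitespace-delimited word; falls back to the
// length of the whole string when it contains no word at all.
fn first_word_len(s: &str) -> usize {
    match s.split_whitespace().next() {
        Some(word) => word.len(),
        None => s.len(),
    }
}

// Hypothetical helper: return the first word itself rather than a length.
fn first_word(s: &str) -> Option<&str> {
    s.split_whitespace().next()
}
```

Because split_whitespace skips leading whitespace, first_word_len("  hello world") is 5 rather than 0, so the result is the word's length and not an offset into the original string. A tab also ends the word: first_word_len("hello\tworld") is 5 as well.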
