How do I split a string into units of each character?

wol*_*n98 3 rust

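I want to take a string that may contain repeated characters and split it into units of each character, that is, into runs of identical consecutive characters.

So for example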

aaaabbbabbbaaaacccbbbbbbbbaaa

would become

[ aaaa, bbb, a, bbb, aaaa, ccc, bbbbbbbb, aaa ]

She*_*ter 7

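A concise approach is to use Itertools::group_by on the iterator of chars (itertools 0.13 and later rename this method to chunk_by and deprecate group_by):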

extern crate itertools;

use itertools::Itertools;

fn main() {
    let input = "aaaabbbabbbaaaacccbbbbbbbbaaa";

    let output: Vec<String> = input
        .chars()
        .group_by(|&x| x)
        .into_iter()
        .map(|(_, r)| r.collect())
        .collect();

    assert_eq!(
        output,
        ["aaaa", "bbb", "a", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
    );
}

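However, this requires allocating a new String for each group of characters. A more efficient solution is to return slices into the original string.

A (hacky) modification of the previous solution yields the following: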

let mut start = input;
let output: Vec<&str> = input
    .chars()
    .group_by(|&x| x)
    .into_iter()
    .map(|(_, r)| {
        let len: usize = r.map(|c| c.len_utf8()).sum();
        let (a, b) = start.split_at(len);
        start = b;
        a
    })
    .collect();
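The snippet above sums c.len_utf8() instead of counting characters because str::split_at takes a byte offset, not a character index. A minimal std-only sketch of the difference, where leading_run_bytes is an illustrative helper and not part of the answer:

```rust
// Illustrative helper (not from the answer): byte length of the leading run
// of characters equal to the first character of `s`.
fn leading_run_bytes(s: &str) -> usize {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) => {
            first.len_utf8()
                + chars
                    .take_while(|&c| c == first)
                    .map(char::len_utf8)
                    .sum::<usize>()
        }
        None => 0,
    }
}

fn main() {
    let s = "ééa";

    // The leading run has two characters but occupies four bytes in UTF-8.
    assert_eq!(s.chars().take_while(|&c| c == 'é').count(), 2);
    assert_eq!(leading_run_bytes(s), 4);

    // Splitting at the byte length separates the whole run.
    assert_eq!(s.split_at(4), ("éé", "a"));

    // Splitting at the character count cuts the run short.
    assert_eq!(s.split_at(2), ("é", "éa"));
}
```

For ASCII-only input the two numbers coincide, which is why the distinction only shows up with multi-byte characters.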


Fre*_*ios 5

If you think an external crate is overkill, you can do this:

fn group_chars(mut input: &str) -> Vec<&str> {
    fn first_different(mut chars: std::str::Chars) -> Option<usize> {
        chars.next().map(|f| {
            chars
                .take_while(|&c| c == f)
                .fold(f.len_utf8(), |len, c| len + c.len_utf8())
        })
    }

    let mut output = Vec::new();

    while let Some(different) = first_different(input.chars()) {
        let (before, after) = input.split_at(different);
        input = after;
        output.push(before);
    }

    output
}

fn main() {
    assert_eq!(
        group_chars("aaaabbbébbbaaaacccbbbbbbbbaaa"),
        ["aaaa", "bbb", "é", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
    );
}

Or you can write an iterator:

pub struct CharGroups<'a> {
    input: &'a str,
}

impl<'a> CharGroups<'a> {
    pub fn new(input: &'a str) -> CharGroups<'a> {
        CharGroups { input }
    }
}

impl<'a> Iterator for CharGroups<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        self.input.chars().next().map(|f| {
            let i = self.input.find(|c| c != f).unwrap_or(self.input.len());
            let (before, after) = self.input.split_at(i);
            self.input = after;
            before
        })
    }
}

fn main() {
    assert_eq!(
        CharGroups::new("aaaabbbébbbaaaacccbbbbbbbbaaa").collect::<Vec<_>>(),
        ["aaaa", "bbb", "é", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
    );
}
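If a dedicated struct feels heavy, the same lazy behaviour can probably be had from a closure with std::iter::from_fn. This is only a sketch of that alternative, not the approach from the answer; char_groups and run_lengths are hypothetical names introduced here for illustration:

```rust
use std::iter;

// Hypothetical helper: lazily yields each run of identical consecutive
// characters as a slice borrowed from `input`.
fn char_groups(mut input: &str) -> impl Iterator<Item = &str> {
    iter::from_fn(move || {
        // Stop once the remaining input is empty.
        let first = input.chars().next()?;
        // Byte offset of the first character that differs from `first`.
        let end = input.find(|c: char| c != first).unwrap_or(input.len());
        let (run, rest) = input.split_at(end);
        input = rest;
        Some(run)
    })
}

// Hypothetical usage: turn the runs into (character, count) pairs.
fn run_lengths(input: &str) -> Vec<(char, usize)> {
    char_groups(input)
        .filter_map(|run| run.chars().next().map(|c| (c, run.chars().count())))
        .collect()
}

fn main() {
    assert_eq!(
        char_groups("aaaabbbébbb").collect::<Vec<_>>(),
        ["aaaa", "bbb", "é", "bbb"]
    );

    // Because the iterator is lazy, only the first two runs are computed here.
    assert_eq!(char_groups("aabbbcccc").take(2).collect::<Vec<_>>(), ["aa", "bbb"]);

    assert_eq!(run_lengths("aabccc"), [('a', 2), ('b', 1), ('c', 3)]);
}
```

Either form avoids allocating a String per group; the struct is easier to name in type signatures, while the closure keeps everything in one function.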