I want to take a string that may contain repeated characters and split it into runs of identical consecutive characters.
So, for example,
aaaabbbabbbaaaacccbbbbbbbbaaa
would become
[ aaaa, bbb, a, bbb, aaaa, ccc, bbbbbbbb, aaa ]
A concise approach is to use Itertools::group_by (renamed to chunk_by in recent itertools releases) on the iterator of chars:
extern crate itertools;

use itertools::Itertools;

fn main() {
    let input = "aaaabbbabbbaaaacccbbbbbbbbaaa";

    let output: Vec<String> = input
        .chars()
        .group_by(|&x| x)
        .into_iter()
        .map(|(_, r)| r.collect())
        .collect();

    assert_eq!(
        output,
        ["aaaa", "bbb", "a", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
    );
}
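However, this requires allocating a new String for each group of characters. A more efficient solution is to return slices into the original string.
A (hacky) modification of the previous solution yields the following: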
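// `start` always holds the part of `input` that has not been sliced off yet.
let mut start = input;

let output: Vec<&str> = input
    .chars()
    .group_by(|&x| x)
    .into_iter()
    .map(|(_, r)| {
        // Sum the UTF-8 byte lengths, because split_at takes a byte offset.
        let len: usize = r.map(|c| c.len_utf8()).sum();
        let (a, b) = start.split_at(len);
        start = b;
        a
    })
    .collect();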
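The closure above sums len_utf8 rather than counting characters because split_at expects a byte offset that falls on a character boundary. A minimal std-only sketch of that distinction follows; byte_len_of_first is a hypothetical helper introduced only for illustration, not part of the answer.

```rust
/// Hypothetical helper: total UTF-8 byte length of the first `n` chars of `s`.
fn byte_len_of_first(s: &str, n: usize) -> usize {
    s.chars().take(n).map(|c| c.len_utf8()).sum()
}

fn main() {
    // Three 'é' (two bytes each) followed by two ASCII 'b'.
    let s = "ééébb";
    assert_eq!(s.chars().count(), 5); // five characters
    assert_eq!(s.len(), 8); // but eight bytes

    // The run of three 'é' is six bytes long, so six is the offset to split at.
    let len = byte_len_of_first(s, 3);
    assert_eq!(len, 6);
    assert_eq!(s.split_at(len), ("ééé", "bb"));

    // The character count (3) is not a valid split point here: it lands in
    // the middle of the second 'é', and split_at would panic on it.
    assert!(!s.is_char_boundary(3));
}
```

For ASCII-only input the two numbers coincide, which is why counting characters can appear to work until the first multi-byte character shows up.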
If you think an external crate is overkill, you can do this:
fn group_chars(mut input: &str) -> Vec<&str> {
    // Byte length of the leading run of identical characters, or None if empty.
    fn first_different(mut chars: std::str::Chars) -> Option<usize> {
        chars.next().map(|f| {
            chars
                .take_while(|&c| c == f)
                .fold(f.len_utf8(), |len, c| len + c.len_utf8())
        })
    }

    let mut output = Vec::new();

    while let Some(different) = first_different(input.chars()) {
        let (before, after) = input.split_at(different);
        input = after;
        output.push(before);
    }

    output
}

fn main() {
    assert_eq!(
        group_chars("aaaabbbébbbaaaacccbbbbbbbbaaa"),
        ["aaaa", "bbb", "é", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
    );
}

Or you can make an iterator:
pub struct CharGroups<'a> {
    input: &'a str,
}

impl<'a> CharGroups<'a> {
    pub fn new(input: &'a str) -> CharGroups<'a> {
        CharGroups { input }
    }
}

impl<'a> Iterator for CharGroups<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        self.input.chars().next().map(|f| {
            let i = self.input.find(|c| c != f).unwrap_or(self.input.len());
            let (before, after) = self.input.split_at(i);
            self.input = after;
            before
        })
    }
}

fn main() {
    assert_eq!(
        CharGroups::new("aaaabbbébbbaaaacccbbbbbbbbaaa").collect::<Vec<_>>(),
        ["aaaa", "bbb", "é", "bbb", "aaaa", "ccc", "bbbbbbbb", "aaa"]
    );
}
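The point of the slice-returning variants is that every group borrows from the original string instead of being copied. One way to check that is to compare addresses, as in the sketch below. Both groups (a compact stand-in for the grouping functions above) and is_slice_of are hypothetical names used only for this illustration.

```rust
/// Compact stand-in for the grouping functions above (illustrative only).
fn groups(mut input: &str) -> Vec<&str> {
    let mut out = Vec::new();
    while let Some(first) = input.chars().next() {
        // Byte index of the first character that differs from `first`.
        let end = input.find(|c: char| c != first).unwrap_or(input.len());
        let (run, rest) = input.split_at(end);
        out.push(run);
        input = rest;
    }
    out
}

/// Hypothetical check: does `inner` lie entirely inside `outer`'s buffer?
fn is_slice_of(outer: &str, inner: &str) -> bool {
    let start = outer.as_ptr() as usize;
    let end = start + outer.len();
    let p = inner.as_ptr() as usize;
    start <= p && p + inner.len() <= end
}

fn main() {
    let input = String::from("aaébb");
    let parts = groups(&input);
    assert_eq!(parts, ["aa", "é", "bb"]);

    // Every group points into `input`'s own buffer; nothing was copied.
    assert!(parts.iter().all(|p| is_slice_of(&input, p)));

    // A separately allocated String with equal contents lives elsewhere.
    let copy = String::from("aa");
    assert!(!is_slice_of(&input, &copy));
}
```

The trade-off is the usual one for borrowed data: the returned slices cannot outlive the input string, whereas the Vec<String> version owns its groups and can.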