Iro*_*ing 2 c++ algorithm vector duplicates std-pair
我将尝试使用以下示例解释我的问题:
vector<pair<string, string>> a = { { "A","1" }, {"B","2" },{ "C","3" },{ "D","3" },{ "E","5" } };
vector<pair<string, string>> b = { { "A","1" },{ "B","3" },{ "D","3" },{ "E","4" },{ "Z","5" } };
Run Code Online (Sandbox Code Playgroud)
什么是最有效的方法来擦除重复并将输出到相同的向量?对的数量非常大,比如大约10万.
两个向量都按第一个元素排序.
vector<pair<string, string>> a = { { "B","2" },{ "C","3" },{ "E","5" } };
vector<pair<string, string>> b = { { "B","3" },{ "E","4" },{ "Z","5" } };
Run Code Online (Sandbox Code Playgroud)
问题是,我需要在删除重复项后比较这些向量.该对中的第一个元素是文件路径,第二个元素是它的校验和.因此,例如,如果我"B","2"在第一个容器中,并且"B","3"是第二个,我可以将此文件列为"已修改".std::set如果这会使这个问题更容易,我愿意使用.
使用运行索引将为您提供O(len(a)+ len(b))时间复杂度和O(1)额外空间(给定a且b已经排序)
void removeDuplicate(vector<pair<string, string>>& a, vector<pair<string, string>>& b) {
//Add these two lines if there can be duplicates in a or b themselves.
//a.erase(std::unique(a.begin(), a.end()), a.end());
//b.erase(std::unique(b.begin(), b.end()), b.end());
size_t i = 0;
size_t j = 0;
size_t p1 = 0;
size_t p2 = 0;
while(i < a.size() && j < b.size()) {
if(a[i] == b[j]) {
i++;
j++;
} else if (a[i] > b[j]) {
b[p2++] = b[j++];
} else if (b[j] > a[i]) {
a[p1++] = a[i++];
}
}
while(i < a.size()) {
a[p1++] = a[i++];
}
while(j < b.size()) {
b[p2++] = b[j++];
}
a.erase(a.begin()+p1, a.end());
b.erase(b.begin()+p2, b.end());
}
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
635 次 |
| 最近记录: |