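I am going to parse 60 GB of text and do a lot of inserts and lookups in maps. I have just started using boost::unordered_set and boost::unordered_map. As my program fills these containers they grow bigger and bigger, and I was wondering whether it would be a good idea to pre-allocate memory for them, with something like mymap::get_allocator().allocate(N); ?
Or should I just let them allocate on their own and work out the growth factor by themselves? The code looks like this: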
    boost::unordered_map<string, long> words_vs_frequency, wordpair_vs_frequency;
    boost::unordered_map<string, float> word_vs_probability, wordpair_vs_probability,
                                        wordpair_vs_MI;
    //... ... ...
    N = words_vs_frequency.size();
    long y = 0;
    float MIWij = 0.0f, maxMI = -999999.0f;
    for (boost::unordered_map<string, long>::iterator i = wordpair_vs_frequency.begin();
         i != wordpair_vs_frequency.end(); ++i) {
        // skip word pairs that occur too rarely
        if (i->second >= BIGRAM_OCCURANCE_THRESHOLD) {
            y++;
            Wij = i->first;
            WordPairToWords(Wij, Wi, Wj);
            // mutual information: log( P(Wi,Wj) / (P(Wi) * P(Wj)) )
            MIWij = log(wordpair_vs_probability[Wij] /
                        (word_vs_probability[Wi] * word_vs_probability[Wj]));
            // keep only the pairs whose MI value is greater than the threshold
            if (MIWij > MUTUAL_INFORMATION_THRESHOLD)
                wordpair_vs_MI[Wij] = MIWij;
            if (MIWij > maxMI)
                maxMI = MIWij;
        }
    }
Thanks in advance.
j_r*_*ker 11
According to the documentation, both unordered_set and unordered_map have a method
    void rehash(size_type n);
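that regenerates the hash table so that it contains at least n buckets. (It sounds like it does for these containers what reserve() does for other STL containers.)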
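As a rough illustration of how rehash() could be used for pre-sizing, here is a minimal sketch. It makes a few assumptions that are not in the question: it uses std::unordered_map (C++11) so that it compiles without Boost, relying on the two containers sharing the rehash(), max_load_factor() and bucket_count() members; the helper name presize is made up for this example; and the expected element count is something you would have to estimate yourself.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper (not part of Boost or the standard library):
// request enough buckets up front that inserting `expected_elements`
// entries should stay within the container's maximum load factor.
template <class Map>
void presize(Map& m, std::size_t expected_elements) {
    assert(m.max_load_factor() > 0.0f);
    const std::size_t buckets = static_cast<std::size_t>(
        std::ceil(expected_elements / m.max_load_factor()));
    m.rehash(buckets);  // the table now has at least `buckets` buckets
}

// Example use: size a word-frequency map before filling it.
// The count passed in is an assumed estimate, not a measured value.
inline std::unordered_map<std::string, long>
make_frequency_map(std::size_t expected_words) {
    std::unordered_map<std::string, long> words_vs_frequency;
    presize(words_vs_frequency, expected_words);
    return words_vs_frequency;
}
```

Note that this only sizes the bucket array; the nodes holding the individual key/value pairs are still allocated one by one as elements are inserted. With std::unordered_map, calling reserve(n) has the same effect as the helper above.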