Dra*_*Tux 8 c++ gzip iostream protocol-buffers
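After a few days of experimenting with Protocol Buffers, I tried to compress the files. With Python this is quite simple and does not require working with streams.
Since most of our code is written in C++, I would like to compress/decompress files in the same language. I have tried the boost gzip library but could not get it to work (the output is not compressed):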
int writeEventCollection(HEP::MyProtoBufClass* protobuf, std::string filename, unsigned int compressionLevel) {
    ofstream file(filename.c_str(), ios_base::out | ios_base::binary);
    filtering_streambuf<output> out;
    out.push(gzip_compressor(compressionLevel));
    out.push(file);
    if (!protobuf->SerializeToOstream(&file)) { // serialising to the wrong stream, I assume
        cerr << "Failed to write ProtoBuf." << endl;
        return -1;
    }
    return 0;
}
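I searched for examples using GzipOutputStream and GzipInputStream with Protocol Buffers but could not find a working one.
As you have probably noticed by now, I am a beginner at best, and I would really appreciate a fully working example along the lines of http://code.google.com/apis/protocolbuffers/docs/cpptutorial.html (I have my address_book; how do I save it in a gzipped file?)
Thank you in advance.
EDIT: Working examples.
Example 1, following the answer here on StackOverflow: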
int writeEventCollection(shared_ptr<HEP::EventCollection> eCollection,
        std::string filename, unsigned int compressionLevel) {
    filtering_ostream out;
    out.push(gzip_compressor(compressionLevel));
    out.push(file_sink(filename, ios_base::out | ios_base::binary));
    if (!eCollection->SerializeToOstream(&out)) {
        cerr << "Failed to write event collection." << endl;
        return -1;
    }
    return 0;
}
Example 2, following the answer on Google's Protobuf discussion group:
int writeEventCollection2(shared_ptr<HEP::EventCollection> eCollection,
        std::string filename, unsigned int compressionLevel) {
    using namespace google::protobuf::io;
    int filedescriptor = open(filename.c_str(), O_WRONLY | O_CREAT | O_TRUNC,
            S_IREAD | S_IWRITE);
    if (filedescriptor == -1) {
        throw "open failed on output file";
    }
    FileOutputStream file_stream(filedescriptor);
    GzipOutputStream::Options options;
    options.format = GzipOutputStream::GZIP;
    options.compression_level = compressionLevel;
    GzipOutputStream gzip_stream(&file_stream, options);
    bool ok = eCollection->SerializeToZeroCopyStream(&gzip_stream);
    // Finish the gzip stream first, then flush and close the file.
    // FileOutputStream::Close() also closes the descriptor, so no separate
    // close() call is needed; closing the descriptor before the streams
    // have flushed would lose the buffered tail of the output.
    ok = gzip_stream.Close() && ok;
    ok = file_stream.Close() && ok;
    if (!ok) {
        cerr << "Failed to write event collection." << endl;
        return -1;
    }
    return 0;
}
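A note on shutdown order with this approach: as far as I know, both GzipOutputStream and FileOutputStream hold buffered data that is only written out when they are closed or destroyed, so the file descriptor has to stay open until both have finished. The sketch below illustrates only that ordering rule. It uses hypothetical toy types (ToyDescriptor, ToyBufferedStream) that are not the protobuf classes and that record events instead of doing any I/O.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types (NOT the protobuf classes): they only record
// whether the descriptor was still open when each stream flushed.
struct ToyDescriptor {
    bool open = true;
    std::vector<std::string> log;
};

class ToyBufferedStream {
public:
    ToyBufferedStream(ToyDescriptor& fd, std::string name)
        : fd_(fd), name_(std::move(name)) {}

    // Flushes on destruction unless close() was already called.
    ~ToyBufferedStream() { close(); }

    void close() {
        if (closed_) return;
        closed_ = true;
        fd_.log.push_back(name_ + (fd_.open ? ": flushed" : ": flush LOST"));
    }

private:
    ToyDescriptor& fd_;
    std::string name_;
    bool closed_ = false;
};

// The descriptor is closed while both streams still hold buffered data.
std::vector<std::string> closeDescriptorFirst() {
    ToyDescriptor fd;
    {
        ToyBufferedStream file_stream(fd, "file");
        ToyBufferedStream gzip_stream(fd, "gzip");
        fd.open = false;  // plays the role of closing the descriptor
    }                     // destructors run here: gzip first, then file
    return fd.log;
}

// The streams are closed explicitly, outermost wrapper first,
// and only then is the descriptor closed.
std::vector<std::string> closeStreamsFirst() {
    ToyDescriptor fd;
    {
        ToyBufferedStream file_stream(fd, "file");
        ToyBufferedStream gzip_stream(fd, "gzip");
        gzip_stream.close();
        file_stream.close();
        fd.open = false;
    }
    return fd.log;
}
```

In the toy, closeDescriptorFirst() records a lost flush for both streams, while closeStreamsFirst() records two successful flushes. With the real classes, the equivalent safe order would be to close the gzip stream, then the file stream, and only then the descriptor.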
Some comments on performance (reading the current format and writing ProtoBuf, 11146 files): Example 1:
real 13m1.185s
user 11m18.500s
sys 0m13.430s
CPU usage: 65-70%
Size of test sample: 4.2 GB (uncompressed 7.7 GB, our current compressed format: 7.7 GB)
Example 2:
real 12m37.061s
user 10m55.460s
sys 0m11.900s
CPU usage: 90-100%
Size of test sample: 3.9 GB
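It seems that Google's method uses the CPU more efficiently, is slightly faster (although I expect the difference to be within the measurement accuracy), and produces a roughly 7% smaller dataset with the same compression setting.
Your assumption is correct: the code you posted does not work because you are writing directly to the ofstream instead of going through the filtering_streambuf. To make this work, you can use a filtering_ostream instead: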
ofstream file(filename.c_str(), ios_base::out | ios_base::binary);
filtering_ostream out;
out.push(gzip_compressor(compressionLevel));
out.push(file);
if (!protobuf->SerializeToOstream(&out)) {
    // ... etc.
}
Or, more concisely, using file_sink:
filtering_ostream out;
out.push(gzip_compressor(compressionLevel));
out.push(file_sink(filename, ios_base::out | ios_base::binary));
if (!protobuf->SerializeToOstream(&out)) {
    // ... etc.
}
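To see why the stream you serialise to matters, here is a minimal standard-library-only sketch. It assumes a hypothetical toy filter, UppercaseFilterBuf, which is not part of Boost and stands in for gzip_compressor only so that the effect of the filter is visible in the output. The helper names writeBypassingFilter and writeThroughFilter are likewise made up for illustration.

```cpp
#include <cctype>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical toy filter (NOT part of Boost): upper-cases every byte and
// forwards it to the wrapped sink. It stands in for gzip_compressor.
class UppercaseFilterBuf : public std::streambuf {
public:
    explicit UppercaseFilterBuf(std::ostream& sink) : sink_(sink) {}

protected:
    // No put area is set up, so every character arrives here one at a time.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof())) {
            return traits_type::not_eof(ch);
        }
        const char upper = static_cast<char>(std::toupper(
            static_cast<unsigned char>(traits_type::to_char_type(ch))));
        sink_.put(upper);
        return sink_ ? ch : traits_type::eof();
    }

private:
    std::ostream& sink_;
};

// Mirrors the bug in the question: the filter chain is built, but the
// payload is written to the underlying sink, so the filter never sees it.
std::string writeBypassingFilter(const std::string& payload) {
    std::ostringstream sink;
    UppercaseFilterBuf filter(sink);
    std::ostream out(&filter);  // built but never written to
    sink << payload;            // wrong stream
    return sink.str();
}

// Mirrors the fix: the payload is written to the stream that owns the filter.
std::string writeThroughFilter(const std::string& payload) {
    std::ostringstream sink;
    UppercaseFilterBuf filter(sink);
    std::ostream out(&filter);
    out << payload;             // passes through the filter, then into the sink
    out.flush();
    return sink.str();
}
```

Calling writeBypassingFilter("protobuf") returns the text unchanged, while writeThroughFilter("protobuf") returns "PROTOBUF". The same thing happens with the gzip filter: bytes written straight to the ofstream reach the file uncompressed.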
I hope this helps!