Related questions and solutions (0)

#include <iostream>
#include <string>
#include <sstream>
#include <vector>
#include <chrono>
#include <random>
#include <exception>
#include <type_traits>
#include <boost/lexical_cast.hpp>

using namespace std;

// 1. A way to easily measure elapsed time -------------------
template<typename TimeT = std::chrono::milliseconds>
struct measure
{
    template<typename F>
    static typename TimeT::rep execution(F const &func)
    {
        // steady_clock is monotonic, so wall-clock adjustments cannot skew the interval
        auto start = std::chrono::steady_clock::now();
        func();
        auto duration = std::chrono::duration_cast< TimeT>(
            std::chrono::steady_clock::now() - start);
        return duration.count();
    }
};
// -----------------------------------------------------------

// 2. Define the conversion functions …

c++ boost stl c++11

Nik*_*iou

2017-05-23

23 votes

1 answer

3300 views

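I wrote a parser for this with boost::spirit. The parser correctly stores each header line and the sequence text that follows it in a std::vector< std::pair< string, string >>, but it takes a long time on larger files (17 seconds for a 100 MB file). For comparison, I wrote a program without boost::spirit (STL functions only) that simply copies every line of a 100 MB file into a std::vector. That whole process takes less than a second. The comparison "program" does not do the actual job, but I don't think the parser should take that much longer...

I know there are plenty of other FASTA parsers, but I am curious why my code is slow.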
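For reference, here is a minimal sketch of what a plain-STL baseline that does the real job could look like, so the Spirit grammar can be timed against something equivalent. It is not the code from the question: the function name `read_fasta` and the alias `FastaRecords` are made up for illustration, and it assumes the usual FASTA layout in which a header line starts with `>` and the sequence may span several following lines.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical alias, mirroring the vector-of-pairs layout described in the question.
using FastaRecords = std::vector<std::pair<std::string, std::string>>;

// Reads FASTA records from `in`: a line starting with '>' opens a new record,
// and every following non-header line is appended to that record's sequence.
inline FastaRecords read_fasta(std::istream& in)
{
    FastaRecords records;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();               // tolerate Windows line endings
        if (line.empty())
            continue;                      // skip blank lines
        if (line.front() == '>')
            records.emplace_back(line.substr(1), std::string());
        else if (!records.empty())
            records.back().second += line; // text before the first header is ignored
    }
    return records;
}
```

Passing a `std::ifstream` opened on the 100 MB file to `read_fasta` and timing the call gives a number that is directly comparable to the Spirit version, since both end up with the same vector of header/sequence pairs.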

The .hpp file:

#include <boost/filesystem/path.hpp>

namespace fs = boost::filesystem;


class FastaReader {

public:
    typedef std::vector< std::pair<std::string, std::string> > fastaVector;

private:
    fastaVector fV;
    fs::path file;  

public:
    FastaReader(const …

c++ parsing boost boost-spirit-qi c++11

Mar*_*ler

2015-07-10

4 votes

4 answers

789 views

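How to read a 4 GB file on a 32-bit system

I have a number of different files; let's assume one of them holds 4 GB of data. I want to read that file line by line and process each line. One of my constraints is that the software has to run on 32-bit MS Windows, or on 64-bit with a small amount of RAM (4 GB minimum). You can also assume that processing the lines is not the bottleneck.

In my current solution I read the file with an ifstream and copy each line into a string. Here is what the snippet looks like.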

std::ifstream file(filename_xml.c_str());
uintmax_t m_numLines = 0;
std::string str;
while (std::getline(file, str))
{
    m_numLines++;
}

This works, but it is slow. Here are the timings for my 3.6 GB of data:

real    1m4.155s
user    0m0.000s
sys     0m0.030s

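I am looking for something faster than this. For example, I found "How to parse space-separated floats in C++ quickly?" and I like the solution proposed there with boost::mapped_file, but I ran into another problem: what if my file is too big? In my case a 1 GB file was already large enough to bring down the whole process. I have to be careful about how much data is in memory at once, because the people using this tool may have no more than 4 GB of RAM installed.

So I found mapped_file from Boost, but how do I use it in my case? Is it possible to read the file in parts and still get the lines?

Maybe you have another, better solution. I have to process every line.

Thanks,
Bart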

c++ boost 32-bit data-processing large-files

bio*_*oky

2017-05-23

3 votes

2 answers

1727 views