Mar*_*sen 38 c++ performance c++11
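I am building a mechanism that lets users compose arbitrarily complex functions from basic building blocks using the decorator pattern. Functionally this works well, but I don't like that it involves a lot of virtual calls, especially as the nesting depth grows. This worries me because the composed functions may be called very often (> 100,000 times).
To avoid this, I tried converting the decorator scheme into a std::function once it is fully built (cf. to_function() in the SSCCE). All internal function calls are wired up while the std::function is being constructed. I expected this to evaluate faster than the original decorator scheme, because the std::function version should not need any virtual lookups.
Alas, benchmarks prove me wrong: the decorator scheme is actually faster than the std::function I build from it. So now I am wondering why. Maybe my test setup is flawed, since I only use two trivial basic functions, which means the vtable lookups may be cached?
The code I used is included below; unfortunately it is rather long.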
// sscce.cpp
#include <iostream>
#include <vector>
#include <memory>
#include <functional>
#include <random>

/**
 * Base class for Pipeline scheme (implemented via decorators)
 */
class Pipeline {
protected:
    std::unique_ptr<Pipeline> wrappee;
    Pipeline(std::unique_ptr<Pipeline> wrap)
    :wrappee(std::move(wrap)){}
    Pipeline():wrappee(nullptr){}

public:
    typedef std::function<double(double)> FnSig;
    double operator()(double input) const{
        if(wrappee.get()) input=wrappee->operator()(input);
        return process(input);
    }
    virtual double process(double input) const=0;
    virtual ~Pipeline(){}
    // Returns a std::function which contains the entire Pipeline stack.
    virtual FnSig to_function() const=0;
};

/**
 * CRTP for to_function().
 */
template <class Derived>
class Pipeline_CRTP : public Pipeline{
protected:
    Pipeline_CRTP(const Pipeline_CRTP<Derived> &o):Pipeline(o){}
    Pipeline_CRTP(std::unique_ptr<Pipeline> wrappee)
    :Pipeline(std::move(wrappee)){}
    Pipeline_CRTP():Pipeline(){};

public:
    typedef typename Pipeline::FnSig FnSig;

    FnSig to_function() const override{
        if(Pipeline::wrappee.get()!=nullptr){
            FnSig wrapfun = Pipeline::wrappee->to_function();
            FnSig processfun = std::bind(&Derived::process,
                                         static_cast<const Derived*>(this),
                                         std::placeholders::_1);
            FnSig fun = [=](double input){
                return processfun(wrapfun(input));
            };
            return std::move(fun);
        }else{
            FnSig processfun = std::bind(&Derived::process,
                                         static_cast<const Derived*>(this),
                                         std::placeholders::_1);
            FnSig fun = [=](double input){
                return processfun(input);
            };
            return std::move(fun);
        }
    }

    virtual ~Pipeline_CRTP(){}
};

/**
 * First concrete derived class: simple scaling.
 */
class Scale: public Pipeline_CRTP<Scale>{
private:
    double scale_;
public:
    Scale(std::unique_ptr<Pipeline> wrap, double scale) // todo move
    :Pipeline_CRTP<Scale>(std::move(wrap)),scale_(scale){}
    Scale(double scale):Pipeline_CRTP<Scale>(),scale_(scale){}

    double process(double input) const override{
        return input*scale_;
    }
};

/**
 * Second concrete derived class: offset.
 */
class Offset: public Pipeline_CRTP<Offset>{
private:
    double offset_;
public:
    Offset(std::unique_ptr<Pipeline> wrap, double offset) // todo move
    :Pipeline_CRTP<Offset>(std::move(wrap)),offset_(offset){}
    Offset(double offset):Pipeline_CRTP<Offset>(),offset_(offset){}

    double process(double input) const override{
        return input+offset_;
    }
};

int main(){
    // used to make a random function / arguments
    // to prevent gcc from being overly clever
    std::default_random_engine generator;
    auto randint = std::bind(std::uniform_int_distribution<int>(0,1),std::ref(generator));
    auto randdouble = std::bind(std::normal_distribution<double>(0.0,1.0),std::ref(generator));

    // make a complex Pipeline
    std::unique_ptr<Pipeline> pipe(new Scale(randdouble()));
    for(unsigned i=0;i<100;++i){
        if(randint()) pipe=std::move(std::unique_ptr<Pipeline>(new Scale(std::move(pipe),randdouble())));
        else pipe=std::move(std::unique_ptr<Pipeline>(new Offset(std::move(pipe),randdouble())));
    }

    // make a std::function from pipe
    Pipeline::FnSig fun(pipe->to_function());

    double bla=0.0;
    for(unsigned i=0; i<100000; ++i){
#ifdef USE_FUNCTION
        // takes 110 ms on average
        bla+=fun(bla);
#else
        // takes 60 ms on average
        bla+=pipe->operator()(bla);
#endif
    }
    std::cout << bla << std::endl;
}
Using pipe:
g++ -std=gnu++11 sscce.cpp -march=native -O3
sudo nice -3 /usr/bin/time ./a.out
-> 60 ms
Using fun:
g++ -DUSE_FUNCTION -std=gnu++11 sscce.cpp -march=native -O3
sudo nice -3 /usr/bin/time ./a.out
-> 110 ms
Seb*_*edl 23
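You have std::functions bound to lambdas that call std::functions bound to lambdas that call std::functions ...
Look at your to_function. It creates a lambda that calls two std::functions, and then binds that lambda into yet another std::function. The compiler will not resolve any of these calls statically.
So in the end you get just as many indirect calls as with the virtual function solution, and that is only if you get rid of the bound processfun and call process directly in the lambda. Otherwise you have twice as many.
If you want a speedup, you will have to build the entire pipeline in a way that can be resolved statically, which means a lot more templates before you finally erase the type into a single std::function.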
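As a rough illustration of that last point, here is a minimal sketch of a pipeline whose stages are composed through templates and type-erased only once at the very end. The names ScaleStage, OffsetStage, Composed, then and make_example_pipeline are hypothetical and do not appear in the question's code; the sketch also assumes the shape of the pipeline is known at compile time, which is not the case for the randomly generated pipeline in the question.

```cpp
#include <functional>
#include <utility>

// Hypothetical stage functors (illustrative names, not from the question).
// operator() is non-virtual, so the compiler can resolve and inline the calls.
struct ScaleStage {
    double scale;
    double operator()(double x) const { return x * scale; }
};

struct OffsetStage {
    double offset;
    double operator()(double x) const { return x + offset; }
};

// Composition of two callables whose concrete types are template parameters,
// so no std::function or virtual call sits between the stages.
template <class Inner, class Outer>
struct Composed {
    Inner inner;
    Outer outer;
    Composed(Inner i, Outer o) : inner(std::move(i)), outer(std::move(o)) {}
    double operator()(double x) const { return outer(inner(x)); }
};

// Helper that deduces the types: then(f, g) yields a callable computing g(f(x)).
template <class Inner, class Outer>
Composed<Inner, Outer> then(Inner inner, Outer outer) {
    return Composed<Inner, Outer>(std::move(inner), std::move(outer));
}

// Builds (x * 2 + 1) * 3 statically; the conversion to std::function in the
// return statement is the only type erasure in the whole pipeline.
std::function<double(double)> make_example_pipeline() {
    return then(then(ScaleStage{2.0}, OffsetStage{1.0}), ScaleStage{3.0});
}
```

Calling make_example_pipeline()(x) then goes through a single std::function dispatch, after which the nested operator() calls are ordinary, inlinable function calls. The price is that each distinct pipeline shape becomes its own type.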
Jon*_*ely 18
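As Sebastian Redl's answer says, your "alternative" to virtual functions adds several layers of indirection through dynamically bound functions (either virtual or through function pointers, depending on the std::function implementation), and then it still calls the virtual Pipeline::process(double) function anyway!
This modification makes it significantly faster, by removing one layer of std::function indirection and by preventing the call to Derived::process from being virtual: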
FnSig to_function() const override {
    FnSig fun;
    auto derived_this = static_cast<const Derived*>(this);
    if (Pipeline::wrappee) {
        FnSig wrapfun = Pipeline::wrappee->to_function();
        fun = [=](double input){
            return derived_this->Derived::process(wrapfun(input));
        };
    } else {
        fun = [=](double input){
            return derived_this->Derived::process(input);
        };
    }
    return fun;
}
There is still more work being done here than in the virtual function version, though.
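std::function is notoriously slow; type erasure and the resulting allocation play a part in this, and with gcc the invocations are also inlined/optimized rather poorly. For this reason there is a plethora of C++ "delegates" with which people try to work around the problem. I ported one to Code Review:
https://codereview.stackexchange.com/questions/14730/impossibly-fast-delegate-in-c11
But you can find plenty of others on Google, or write your own.
Edit:
These days, look here for a fast delegate.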
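To give an idea of what such a delegate can look like, here is a minimal sketch of one common design: a non-owning object pointer plus a plain function pointer to a template-generated stub, so that binding neither allocates nor copies the target. This is not the code from the Code Review post linked above; Delegate, method_stub, from, Scaler and delegate_example are hypothetical names, and the sketch only supports const member functions.

```cpp
#include <cassert>
#include <utility>

template <class Signature>
class Delegate;

// Hypothetical minimal delegate: it does not own the target object, so the
// object must outlive the delegate.
template <class R, class... Args>
class Delegate<R(Args...)> {
    const void* object_ = nullptr;
    R (*stub_)(const void*, Args...) = nullptr;

    // One stub is generated per (type, member function) pair. The member
    // function is a template argument, so the call inside the stub is direct.
    template <class T, R (T::*Method)(Args...) const>
    static R method_stub(const void* object, Args... args) {
        return (static_cast<const T*>(object)->*Method)(std::forward<Args>(args)...);
    }

public:
    // Binds a const member function of an existing object.
    template <class T, R (T::*Method)(Args...) const>
    static Delegate from(const T& object) {
        Delegate d;
        d.object_ = &object;
        d.stub_ = &method_stub<T, Method>;
        return d;
    }

    // A single indirect call through the stored function pointer.
    R operator()(Args... args) const {
        assert(stub_ != nullptr && "Delegate called before being bound");
        return stub_(object_, std::forward<Args>(args)...);
    }
};

// Illustrative target type and usage.
struct Scaler {
    double factor;
    double apply(double x) const { return x * factor; }
};

double delegate_example(double x) {
    Scaler doubler{2.0};
    Delegate<double(double)> call =
        Delegate<double(double)>::from<Scaler, &Scaler::apply>(doubler);
    return call(x);  // calls doubler.apply(x) through the stub
}
```

Copying such a delegate copies two pointers. The trade-off compared with std::function is that lifetime management is left entirely to the caller.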
The libstdc++ implementation of std::function works roughly as follows:
// Simplified pseudocode, not compilable C++.
template<typename Signature>
struct Function
{
    Ptr functor;          // type-erased pointer to the stored functor
    Ptr functor_manager;  // pointer to FunctorManager<Functor>::manage

    template<class Functor>
    Function(const Functor& f)
    {
        functor_manager = &FunctorManager<Functor>::manage;
        functor = new Functor(f);
    }

    Function(const Function& that)
    {
        functor_manager = that.functor_manager;
        functor = functor_manager(CLONE, that.functor);
    }

    R operator()(args) // Signature
    {
        return functor_manager(INVOKE, functor, args);
    }

    ~Function()
    {
        functor_manager(DESTROY, functor);
    }
};

template<class Functor>
struct FunctorManager
{
    static manage(int operation, Functor& f)
    {
        switch (operation)
        {
        case CLONE:   call Functor copy constructor;
        case INVOKE:  call Functor::operator();
        case DESTROY: call Functor destructor;
        }
    }
};
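So although std::function does not know the exact type of the Functor object, it dispatches the important operations through the functor_manager function pointer, which points to a static function of a template instance that does know the Functor type.
Each std::function instance allocates its own copy of the functor object on the heap (unless the functor is no larger than a pointer, such as a function pointer, in which case it simply holds it in place as a subobject).
The important takeaway is that copying a std::function is expensive if the underlying functor object has an expensive copy constructor and/or takes up a lot of space (for example, to hold bound parameters).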
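The copying behaviour described above can be observed with a small experiment. The following is a sketch under stated assumptions: CountingFunctor and the two helper functions are hypothetical names, and the exact number of copies, as well as whether the functor lives on the heap, depends on the standard library implementation. What the standard does guarantee is that a copied std::function holds its own copy of the target, whereas wrapping the functor in std::ref makes the std::function store only a std::reference_wrapper.

```cpp
#include <array>
#include <functional>

// Hypothetical functor that counts how often it is copied. The payload makes
// it larger than a pointer, so it is unlikely to fit into in-place storage.
struct CountingFunctor {
    static int copies;
    std::array<double, 8> payload;

    CountingFunctor() : payload() {}
    CountingFunctor(const CountingFunctor& other) : payload(other.payload) {
        ++copies;
    }
    double operator()(double x) const { return x + payload[0]; }
};

int CountingFunctor::copies = 0;

// Number of functor copies caused by copying a std::function that owns
// the functor.
int copies_when_copying_function() {
    std::function<double(double)> f = CountingFunctor();
    const int before = CountingFunctor::copies;
    std::function<double(double)> g = f;  // clones the stored functor
    return CountingFunctor::copies - before;
}

// Same experiment, but the std::function only stores a reference wrapper,
// so copying it does not touch the functor. The functor must then outlive
// every std::function that refers to it.
int copies_when_copying_function_with_ref() {
    CountingFunctor functor;
    std::function<double(double)> f = std::ref(functor);
    const int before = CountingFunctor::copies;
    std::function<double(double)> g = f;  // copies only the reference_wrapper
    return CountingFunctor::copies - before;
}
```

In the question's to_function(), each lambda captures its inner std::function objects by value, so copying the outermost std::function copies the whole chain of nested targets.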