Explicit template function instantiation with inlining

Question


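A colleague and I have been discussing the benefits of explicit template instantiation as a way to reduce compile times and separate declarations from definitions, without hurting the performance of a C++ math library I wrote for use in other projects.

Basically, I have a library of useful math functions for working with primitives such as Vector3, Vector4, Quaternion, etc. All of these functions are meant to be used with float or double as the template parameter (and in some cases int).

So that I don't have to write each function twice, once for float and once for double, the implementations are templated, like so: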

template<typename T>
Vector3<T> foo(const Vector4<T>& a, 
               const Quaternion<T>& b) 
{ /* do something... */ }


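They are all defined in the .h files (so they are implicitly marked inline). Most of these functions are short, and I expect them to be inlined at the call site when compiled.

However, the headers keep getting larger, compile times keep growing, and it is hard to tell whether a function exists just by skimming a header (one of the many reasons I like to separate declarations from implementations).

So I could use explicit template instantiation in an accompanying .cpp file, like so: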

  //in .h
  template<typename T>
  Vector3<T> foo(const Vector4<T>& a, 
                 const Quaternion<T>& b) 
  { /* do something... */ }

  //in .cpp
  template Vector3<float> foo<float>(const Vector4<float>& a, 
                                     const Quaternion<float>& b);
  template Vector3<double> foo<double>(const Vector4<double>& a, 
                                       const Quaternion<double>& b);


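Should this help compile times? Would it affect the likelihood of the functions being inlined? Is the answer to either of these questions generally compiler-specific?

An added benefit is that it verifies that the function compiles, even if I'm not using it yet.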
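
As a minimal sketch of this first layout, the single file below stands in for the header, the accompanying .cpp, and a caller. The struct definitions and the body of `foo` are placeholders invented for illustration (the real types and math are not shown in this post), and `call_foo` is a hypothetical caller.

```cpp
#include <cassert>

// ---- "math.h" (hypothetical minimal stand-ins for the real types) ----
template <typename T> struct Vector3    { T x, y, z; };
template <typename T> struct Vector4    { T x, y, z, w; };
template <typename T> struct Quaternion { T x, y, z, w; };

// Placeholder body: scales the xyz part of `a` by the scalar part of `b`.
template <typename T>
Vector3<T> foo(const Vector4<T>& a, const Quaternion<T>& b)
{
    return Vector3<T>{a.x * b.w, a.y * b.w, a.z * b.w};
}

// ---- "math.cpp" ----
// Explicit instantiation definitions: the body of foo is compiled for float
// and double right here, whether or not any caller exists yet.
template Vector3<float>  foo<float>(const Vector4<float>&, const Quaternion<float>&);
template Vector3<double> foo<double>(const Vector4<double>&, const Quaternion<double>&);

// ---- "user.cpp" (hypothetical caller) ----
double call_foo()
{
    const Vector4<double>    a{1.0, 2.0, 3.0, 4.0};
    const Quaternion<double> b{0.0, 0.0, 0.0, 2.0};
    const Vector3<double>    r = foo(a, b);
    assert(r.x == 2.0);  // 1.0 * 2.0
    return r.z;          // 3.0 * 2.0
}
```

In a real project the three sections would live in separate files. Because the definition is still in the header, every file that calls `foo` can still see the body, so this variant mainly provides the compile check described above.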

I could also do this:

  //in .h
  template<typename T>
  Vector3<T> foo(const Vector4<T>& a, 
                 const Quaternion<T>& b);

  //in .cpp
  template<typename T>
  Vector3<T> foo(const Vector4<T>& a, 
                 const Quaternion<T>& b) 
  { /* do something... */ }

  template Vector3<float> foo<float>(const Vector4<float>& a, 
                                     const Quaternion<float>& b);
  template Vector3<double> foo<double>(const Vector4<double>& a, 
                                       const Quaternion<double>& b);


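The same questions apply to this approach:

Should this help compile times? Would it affect the likelihood of the functions being inlined? Is the answer to either of these questions generally compiler-specific?

I expect the likelihood of inlining is definitely affected, since the definitions are no longer in the header.

What's nice is that it separates the declaration and the definition of the templated functions (for specific template parameters) without resorting to something like an .inl file included at the bottom of the .h file. It also hides the implementation from users of the library, which is beneficial (though not strictly necessary), while still letting me use templates so that I don't have to implement each function N times.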
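
A minimal sketch of this second layout, again collapsed into one file for illustration. The types, the body of `foo`, and the caller `use_foo` are hypothetical placeholders. The caller section is deliberately placed before the definition to mimic a user file that sees only the declaration.

```cpp
#include <cassert>

// ---- "math.h": declarations only (hypothetical minimal types) ----
template <typename T> struct Vector3    { T x, y, z; };
template <typename T> struct Vector4    { T x, y, z, w; };
template <typename T> struct Quaternion { T x, y, z, w; };

template <typename T>
Vector3<T> foo(const Vector4<T>& a, const Quaternion<T>& b);

// ---- "user.cpp": compiled against the declaration above ----
float use_foo()
{
    const Vector4<float>    a{1.0f, 2.0f, 3.0f, 4.0f};
    const Quaternion<float> b{0.0f, 0.0f, 0.0f, 2.0f};
    const Vector3<float>    r = foo(a, b);  // resolved to foo<float>
    assert(r.y == 4.0f);                    // 2.0f * 2.0f
    return r.x;                             // 1.0f * 2.0f
}

// ---- "math.cpp": definition plus explicit instantiations ----
// Placeholder body: scales the xyz part of `a` by the scalar part of `b`.
template <typename T>
Vector3<T> foo(const Vector4<T>& a, const Quaternion<T>& b)
{
    return Vector3<T>{a.x * b.w, a.y * b.w, a.z * b.w};
}

template Vector3<float>  foo<float>(const Vector4<float>&, const Quaternion<float>&);
template Vector3<double> foo<double>(const Vector4<double>&, const Quaternion<double>&);
```

Once the sections are split into separate files, a call with a type that was not explicitly instantiated (for example `int`) would compile against the header but fail at link time, because no definition of that specialization exists in any object file.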

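Is there any way to tweak this approach so that inlining is still possible?

I have found it hard to get answers to these questions just by Googling, and the standard is difficult to follow on these topics (at least for me).

By the way, this is expected to compile with VS2010, VS2012, and GCC 4.7.

Any assistance would be appreciated.

Thanks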

Answer 1

wil*_*llj 5

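I assume your technique is intended to do the same thing as the answer to this question: Template instantiation effect on compile duration

To achieve the desired result, you also need to prevent automatic instantiation by declaring the explicit instantiations in the header using extern. See Explicit instantiation declaration with extern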

//in .h
template<typename T>
Vector3<T> foo(const Vector4<T>& a, 
               const Quaternion<T>& b);

extern template Vector3<float> foo<float>(const Vector4<float>& a, 
                                          const Quaternion<float>& b);

extern template Vector3<double> foo<double>(const Vector4<double>& a, 
                                            const Quaternion<double>& b);

//in .cpp
template<typename T>
Vector3<T> foo(const Vector4<T>& a, 
               const Quaternion<T>& b) 
{ /* do something...*/ }

template Vector3<float> foo<float>(const Vector4<float>& a, 
                                   const Quaternion<float>& b);
template Vector3<double> foo<double>(const Vector4<double>& a, 
                                     const Quaternion<double>& b);


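Should this help compile times? Would it affect the likelihood of the functions being inlined? Is the answer to either of these questions generally compiler-specific?

The answer is highly compiler-dependent, and is best determined empirically, but we can generalize about it.

We can assume that the increase in compile time comes not from the cost of parsing the extra template angle-bracket syntax, but from the cost of the (complex) process of template instantiation. If that is the case, using a given template specialization in multiple translation units should significantly increase compile time only if instantiation is expensive and the compiler performs the instantiation more than once.

The C++ standard implicitly allows the compiler to instantiate each unique template specialization only once across all translation units. That is, the instantiation of template functions can be deferred and performed after the initial compilation, as described in the Comeau documentation. Whether this optimization is implemented depends on the compiler, but it is certainly not implemented in any version of MSVC before 2015.

If your compiler performs instantiation at link time and does not support cross-module inlining, this technique will prevent inlining. Newer versions of MSVC, GCC, and Clang all support cross-module inlining at link time with an additional linker option (LTCG or LTO). See Can the linker inline functions?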
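
One possible tweak that does not rely on link-time optimization, offered here as a sketch rather than as part of the approach above: keep the definition in the header, mark the template `inline`, and still declare the explicit instantiations with `extern`. The standard notes that an inline function subject to an explicit instantiation declaration may still be implicitly instantiated so that its body can be considered for inlining, while the out-of-line copy is expected to come from the one translation unit that holds the explicit instantiation definition. The types, the body of `foo`, and `use_foo` are hypothetical placeholders, and whether this helps compile time at all would need to be measured.

```cpp
#include <cassert>

// ---- "math.h" (hypothetical minimal types) ----
template <typename T> struct Vector3    { T x, y, z; };
template <typename T> struct Vector4    { T x, y, z, w; };
template <typename T> struct Quaternion { T x, y, z, w; };

// The definition stays visible to callers and is marked inline.
// Placeholder body: scales the xyz part of `a` by the scalar part of `b`.
template <typename T>
inline Vector3<T> foo(const Vector4<T>& a, const Quaternion<T>& b)
{
    return Vector3<T>{a.x * b.w, a.y * b.w, a.z * b.w};
}

// Explicit instantiation declarations: an out-of-line copy is provided
// elsewhere, so this translation unit does not need to emit one.
extern template Vector3<float>  foo<float>(const Vector4<float>&, const Quaternion<float>&);
extern template Vector3<double> foo<double>(const Vector4<double>&, const Quaternion<double>&);

// ---- "math.cpp": the single place that emits the out-of-line copies ----
template Vector3<float>  foo<float>(const Vector4<float>&, const Quaternion<float>&);
template Vector3<double> foo<double>(const Vector4<double>&, const Quaternion<double>&);

// ---- "user.cpp" (hypothetical caller) ----
double use_foo()
{
    const Vector4<double>    a{1.0, 2.0, 3.0, 4.0};
    const Quaternion<double> b{0.0, 0.0, 0.0, 2.0};
    const Vector3<double>    r = foo(a, b);  // body is visible, so it may be inlined
    assert(r.x == 2.0);                      // 1.0 * 2.0
    return r.z;                              // 3.0 * 2.0
}
```

The trade-off is that the body is parsed in every translation unit again, and it is no longer hidden from users of the library, so this gives back part of what the split between declaration and definition was meant to achieve.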
