Ale*_*lex 495 c++ algorithm floating-point optimization
比较两个double或两个float值的最有效方法是什么?
简单地这样做是不正确的:
bool CompareDoubles1 (double A, double B)
{
return A == B;
}
Run Code Online (Sandbox Code Playgroud)
但是像这样:
bool CompareDoubles2 (double A, double B)
{
diff = A - B;
return (diff < EPSILON) && (-diff < EPSILON);
}
Run Code Online (Sandbox Code Playgroud)
似乎浪费处理.
有谁知道更聪明的浮动比较器?
And*_*ein 437
使用任何其他建议时要格外小心.这一切都取决于背景.
我花了很长时间跟踪系统中的错误,假设a==b是|a-b|<epsilon.潜在的问题是:
算法中隐含的假设,if a==b和b==cthen a==c.
使用相同的epsilon测量以英寸为单位的线和以mils(.001英寸)测量的线.那是a==b但是1000a!=1000b.(这就是AlmostEqual2sComplement要求epsilon或max ULPS的原因).
对于角度的余弦和线的长度,使用相同的epsilon!
使用这样的比较函数对集合中的项目进行排序.(在这种情况下,使用内置C++运算符== for double产生了正确的结果.)
就像我说:这一切都取决于背景和预期的大小a和b.
BTW,std::numeric_limits<double>::epsilon()是"机器epsilon".它是1.0和下一个值之间的差值,可用双精度表示.我猜它可以在比较函数中使用,但只有在预期值小于1时才会使用.(这是对@ cdv答案的回应......)
另外,如果你基本上有int算术doubles(这里我们使用双精度来保存某些情况下的int值)你的算术是正确的.例如,4.0/2.0将与1.0 + 1.0相同.只要您不执行导致分数(4.0/3.0)或不超出int大小的事情.
OJ.*_*OJ. 175
与epsilon值的比较是大多数人所做的(即使在游戏编程中).
你应该稍微改变你的实现:
bool AreSame(double a, double b)
{
return fabs(a - b) < EPSILON;
}
Run Code Online (Sandbox Code Playgroud)
编辑:Christer在最近的博客文章中添加了一堆有关此主题的精彩信息.请享用.
skr*_*bel 113
我发现Google C++测试框架包含一个很好的跨平台基于模板的AlmostEqual2sComplement实现,它可以在双精度和浮点数上运行.鉴于它是根据BSD许可证发布的,只要您保留许可证,在您自己的代码中使用它应该没有问题.我从http://code.google.com/p/googletest/source/browse/trunk/include/gtest/internal/gtest-internal.h https://github.com/google/googletest/blob中提取了以下代码/master/googletest/include/gtest/internal/gtest-internal.h并在顶部添加了许可证.
一定要#define GTEST_OS_WINDOWS到某个值(或者将代码改为适合代码库的代码 - 毕竟它是BSD许可的).
用法示例:
double left = // something
double right = // something
const FloatingPoint<double> lhs(left), rhs(right);
if (lhs.AlmostEquals(rhs)) {
//they're equal!
}
Run Code Online (Sandbox Code Playgroud)
这是代码:
// Copyright 2005, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Authors: wan@google.com (Zhanyong Wan), eefacm@gmail.com (Sean Mcafee)
//
// The Google C++ Testing Framework (Google Test)
// This template class serves as a compile-time function from size to
// type. It maps a size in bytes to a primitive type with that
// size. e.g.
//
// TypeWithSize<4>::UInt
//
// is typedef-ed to be unsigned int (unsigned integer made up of 4
// bytes).
//
// Such functionality should belong to STL, but I cannot find it
// there.
//
// Google Test uses this class in the implementation of floating-point
// comparison.
//
// For now it only handles UInt (unsigned int) as that's all Google Test
// needs. Other types can be easily added in the future if need
// arises.
template <size_t size>
class TypeWithSize {
public:
// This prevents the user from using TypeWithSize<N> with incorrect
// values of N.
typedef void UInt;
};
// The specialization for size 4.
template <>
class TypeWithSize<4> {
public:
// unsigned int has size 4 in both gcc and MSVC.
//
// As base/basictypes.h doesn't compile on Windows, we cannot use
// uint32, uint64, and etc here.
typedef int Int;
typedef unsigned int UInt;
};
// The specialization for size 8.
template <>
class TypeWithSize<8> {
public:
#if GTEST_OS_WINDOWS
typedef __int64 Int;
typedef unsigned __int64 UInt;
#else
typedef long long Int; // NOLINT
typedef unsigned long long UInt; // NOLINT
#endif // GTEST_OS_WINDOWS
};
// This template class represents an IEEE floating-point number
// (either single-precision or double-precision, depending on the
// template parameters).
//
// The purpose of this class is to do more sophisticated number
// comparison. (Due to round-off error, etc, it's very unlikely that
// two floating-points will be equal exactly. Hence a naive
// comparison by the == operation often doesn't work.)
//
// Format of IEEE floating-point:
//
// The most-significant bit being the leftmost, an IEEE
// floating-point looks like
//
// sign_bit exponent_bits fraction_bits
//
// Here, sign_bit is a single bit that designates the sign of the
// number.
//
// For float, there are 8 exponent bits and 23 fraction bits.
//
// For double, there are 11 exponent bits and 52 fraction bits.
//
// More details can be found at
// http://en.wikipedia.org/wiki/IEEE_floating-point_standard.
//
// Template parameter:
//
// RawType: the raw floating-point type (either float or double)
template <typename RawType>
class FloatingPoint {
public:
// Defines the unsigned integer type that has the same size as the
// floating point number.
typedef typename TypeWithSize<sizeof(RawType)>::UInt Bits;
// Constants.
// # of bits in a number.
static const size_t kBitCount = 8*sizeof(RawType);
// # of fraction bits in a number.
static const size_t kFractionBitCount =
std::numeric_limits<RawType>::digits - 1;
// # of exponent bits in a number.
static const size_t kExponentBitCount = kBitCount - 1 - kFractionBitCount;
// The mask for the sign bit.
static const Bits kSignBitMask = static_cast<Bits>(1) << (kBitCount - 1);
// The mask for the fraction bits.
static const Bits kFractionBitMask =
~static_cast<Bits>(0) >> (kExponentBitCount + 1);
// The mask for the exponent bits.
static const Bits kExponentBitMask = ~(kSignBitMask | kFractionBitMask);
// How many ULP's (Units in the Last Place) we want to tolerate when
// comparing two numbers. The larger the value, the more error we
// allow. A 0 value means that two numbers must be exactly the same
// to be considered equal.
//
// The maximum error of a single floating-point operation is 0.5
// units in the last place. On Intel CPU's, all floating-point
// calculations are done with 80-bit precision, while double has 64
// bits. Therefore, 4 should be enough for ordinary use.
//
// See the following article for more details on ULP:
// http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm.
static const size_t kMaxUlps = 4;
// Constructs a FloatingPoint from a raw floating-point number.
//
// On an Intel CPU, passing a non-normalized NAN (Not a Number)
// around may change its bits, although the new value is guaranteed
// to be also a NAN. Therefore, don't expect this constructor to
// preserve the bits in x when x is a NAN.
explicit FloatingPoint(const RawType& x) { u_.value_ = x; }
// Static methods
// Reinterprets a bit pattern as a floating-point number.
//
// This function is needed to test the AlmostEquals() method.
static RawType ReinterpretBits(const Bits bits) {
FloatingPoint fp(0);
fp.u_.bits_ = bits;
return fp.u_.value_;
}
// Returns the floating-point number that represent positive infinity.
static RawType Infinity() {
return ReinterpretBits(kExponentBitMask);
}
// Non-static methods
// Returns the bits that represents this number.
const Bits &bits() const { return u_.bits_; }
// Returns the exponent bits of this number.
Bits exponent_bits() const { return kExponentBitMask & u_.bits_; }
// Returns the fraction bits of this number.
Bits fraction_bits() const { return kFractionBitMask & u_.bits_; }
// Returns the sign bit of this number.
Bits sign_bit() const { return kSignBitMask & u_.bits_; }
// Returns true iff this is NAN (not a number).
bool is_nan() const {
// It's a NAN if the exponent bits are all ones and the fraction
// bits are not entirely zeros.
return (exponent_bits() == kExponentBitMask) && (fraction_bits() != 0);
}
// Returns true iff this number is at most kMaxUlps ULP's away from
// rhs. In particular, this function:
//
// - returns false if either number is (or both are) NAN.
// - treats really large numbers as almost equal to infinity.
// - thinks +0.0 and -0.0 are 0 DLP's apart.
bool AlmostEquals(const FloatingPoint& rhs) const {
// The IEEE standard says that any comparison operation involving
// a NAN must return false.
if (is_nan() || rhs.is_nan()) return false;
return DistanceBetweenSignAndMagnitudeNumbers(u_.bits_, rhs.u_.bits_)
<= kMaxUlps;
}
private:
// The data type used to store the actual floating-point number.
union FloatingPointUnion {
RawType value_; // The raw floating-point number.
Bits bits_; // The bits that represent the number.
};
// Converts an integer from the sign-and-magnitude representation to
// the biased representation. More precisely, let N be 2 to the
// power of (kBitCount - 1), an integer x is represented by the
// unsigned number x + N.
//
// For instance,
//
// -N + 1 (the most negative number representable using
// sign-and-magnitude) is represented by 1;
// 0 is represented by N; and
// N - 1 (the biggest number representable using
// sign-and-magnitude) is represented by 2N - 1.
//
// Read http://en.wikipedia.org/wiki/Signed_number_representations
// for more details on signed number representations.
static Bits SignAndMagnitudeToBiased(const Bits &sam) {
if (kSignBitMask & sam) {
// sam represents a negative number.
return ~sam + 1;
} else {
// sam represents a positive number.
return kSignBitMask | sam;
}
}
// Given two numbers in the sign-and-magnitude representation,
// returns the distance between them as an unsigned number.
static Bits DistanceBetweenSignAndMagnitudeNumbers(const Bits &sam1,
const Bits &sam2) {
const Bits biased1 = SignAndMagnitudeToBiased(sam1);
const Bits biased2 = SignAndMagnitudeToBiased(sam2);
return (biased1 >= biased2) ? (biased1 - biased2) : (biased2 - biased1);
}
FloatingPointUnion u_;
};
Run Code Online (Sandbox Code Playgroud)
编辑:这篇文章是4岁.它可能仍然有效,代码很好,但有些人发现了改进.最好AlmostEquals从Google Test源代码中获取最新版本,而不是我在此处粘贴的版本.
mch*_*mch 103
比较浮点数取决于上下文.因为即使改变操作顺序也会产生不同的结果,重要的是要知道你想要数字的"相等"程度.
在查看浮点比较时,比较布鲁斯道森的浮点数是一个很好的起点.
以下定义来自Knuth的计算机编程艺术:
bool approximatelyEqual(float a, float b, float epsilon)
{
return fabs(a - b) <= ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
bool essentiallyEqual(float a, float b, float epsilon)
{
return fabs(a - b) <= ( (fabs(a) > fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
bool definitelyGreaterThan(float a, float b, float epsilon)
{
return (a - b) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
bool definitelyLessThan(float a, float b, float epsilon)
{
return (b - a) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
Run Code Online (Sandbox Code Playgroud)
当然,选择epsilon取决于上下文,并确定您希望数字的相等程度.
比较浮点数的另一种方法是查看数字的ULP(最后位置的单位).虽然没有专门处理比较,但是每个计算机科学家应该知道的浮点数是一个很好的资源,可以用来理解浮点的工作原理和陷阱是什么,包括ULP是什么.
gro*_*rom 47
有关更深入的方法,请参阅比较浮点数.以下是该链接的代码段:
// Usable AlmostEqual function
bool AlmostEqual2sComplement(float A, float B, int maxUlps)
{
// Make sure maxUlps is non-negative and small enough that the
// default NAN won't compare as equal to anything.
assert(maxUlps > 0 && maxUlps < 4 * 1024 * 1024);
int aInt = *(int*)&A;
// Make aInt lexicographically ordered as a twos-complement int
if (aInt < 0)
aInt = 0x80000000 - aInt;
// Make bInt lexicographically ordered as a twos-complement int
int bInt = *(int*)&B;
if (bInt < 0)
bInt = 0x80000000 - bInt;
int intDiff = abs(aInt - bInt);
if (intDiff <= maxUlps)
return true;
return false;
}
Run Code Online (Sandbox Code Playgroud)
Chr*_*ies 25
在C++中获取epsilon的便携方式是
#include <limits>
std::numeric_limits<double>::epsilon()
Run Code Online (Sandbox Code Playgroud)
然后比较功能变为
#include <cmath>
#include <limits>
bool AreSame(double a, double b) {
return std::fabs(a - b) < std::numeric_limits<double>::epsilon();
}
Run Code Online (Sandbox Code Playgroud)
Sha*_*our 25
认识到这是一个旧线程,但本文是我在比较浮点数时发现的最直接的文章之一,如果你想探索更多,它还有更详细的参考资料,主要网站涵盖了一系列问题处理浮点数浮点指南:比较.
我们可以在浮点公差中找到一篇更实用的文章,并注意到有绝对容差测试,可以归结为C++:
bool absoluteToleranceCompare(double x, double y)
{
return std::fabs(x - y) <= std::numeric_limits<double>::epsilon() ;
}
Run Code Online (Sandbox Code Playgroud)
和相对耐受性测试:
bool relativeToleranceCompare(double x, double y)
{
double maxXY = std::max( std::fabs(x) , std::fabs(y) ) ;
return std::fabs(x - y) <= std::numeric_limits<double>::epsilon()*maxXY ;
}
Run Code Online (Sandbox Code Playgroud)
文章指出,绝对的测试失败时x和y又大又处于相对的情况下失败的时候都小.假设他的绝对和相对容差是相同的,组合测试将如下所示:
bool combinedToleranceCompare(double x, double y)
{
double maxXYOne = std::max( { 1.0, std::fabs(x) , std::fabs(y) } ) ;
return std::fabs(x - y) <= std::numeric_limits<double>::epsilon()*maxXYOne ;
}
Run Code Online (Sandbox Code Playgroud)
ful*_*ton 14
你写的代码被窃听:
return (diff < EPSILON) && (-diff > EPSILON);
Run Code Online (Sandbox Code Playgroud)
正确的代码是:
return (diff < EPSILON) && (diff > -EPSILON);
Run Code Online (Sandbox Code Playgroud)
(...是的,这是不同的)
我想知道在某些情况下,晶圆厂不会让你失去懒惰的评价.我会说这取决于编译器.你可能想尝试两者.如果它们的平均值相等,那就采用fabs实现.
如果你有一些关于两个浮点数哪个更可能比其他浮点数更大的信息,你可以按照比较的顺序进行游戏以更好地利用延迟评估.
最后,通过内联此函数可能会获得更好的结果.虽然不太可能改善...
编辑:OJ,感谢您更正代码.我相应地删除了我的评论
小智 13
`return fabs(a - b)<EPSILON;
这样可以:
但否则它会让你陷入困境.双精度数字的分辨率约为16位小数.如果您要比较的两个数字的幅度大于EPSILON*1.0E16,那么您可能会说:
return a==b;
Run Code Online (Sandbox Code Playgroud)
我将研究一种不同的方法,假设您需要担心第一个问题,并假设第二个问题适合您的应用程序.一个解决方案是这样的:
#define VERYSMALL (1.0E-150)
#define EPSILON (1.0E-8)
bool AreSame(double a, double b)
{
double absDiff = fabs(a - b);
if (absDiff < VERYSMALL)
{
return true;
}
double maxAbs = max(fabs(a) - fabs(b));
return (absDiff/maxAbs) < EPSILON;
}
Run Code Online (Sandbox Code Playgroud)
这在计算上是昂贵的,但它有时是所要求的.这是我们在公司必须做的事情,因为我们处理工程库,输入可能会有几十个数量级.
无论如何,关键在于(并且几乎适用于所有编程问题):评估您的需求,然后提出解决方案来满足您的需求 - 不要认为简单的答案将满足您的需求.如果在你的评估之后你发现fabs(a-b) < EPSILON它就足够了,完美 - 使用它!但要注意它的缺点和其他可能的解决方案.
Shi*_*hah 10
我花了很长时间在这个伟大的线程中浏览材料.我怀疑每个人都想花这么多时间,所以我要强调我学到的内容和我实施的解决方案的总结.
快速摘要
numeric_limits::epsilon()与float.h中的FLT_EPSILON相同.然而,这是有问题的,因为epsilon用于比较1.0之类的值,如果与epsilon不相同,则用于像1E9这样的值.FLT_EPSILON定义为1.0.fabs(a-b) <= epsilon不起作用的,因为默认epsilon定义为1.0.我们需要根据a和b来向上或向下扩展epsilon.max(a,b)要么可以获得围绕a的下一个可表示数字,然后查看b是否属于该范围.前者称为"相对"方法,后来称为ULP方法.实用功能实现(C++ 11)
//implements relative method - do not use for comparing with zero
//use this most of the time, tolerance needs to be meaningful in your context
template<typename TReal>
static bool isApproximatelyEqual(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
TReal diff = std::fabs(a - b);
if (diff <= tolerance)
return true;
if (diff < std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
return true;
return false;
}
//supply tolerance that is meaningful in your context
//for example, default tolerance may not work if you are comparing double with float
template<typename TReal>
static bool isApproximatelyZero(TReal a, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
if (std::fabs(a) <= tolerance)
return true;
return false;
}
//use this when you want to be on safe side
//for example, don't start rover unless signal is above 1
template<typename TReal>
static bool isDefinitelyLessThan(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
TReal diff = a - b;
if (diff < tolerance)
return true;
if (diff < std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
return true;
return false;
}
template<typename TReal>
static bool isDefinitelyGreaterThan(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
TReal diff = a - b;
if (diff > tolerance)
return true;
if (diff > std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
return true;
return false;
}
//implements ULP method
//use this when you are only concerned about floating point precision issue
//for example, if you want to see if a is 1.0 by checking if its within
//10 closest representable floating point numbers around 1.0.
template<typename TReal>
static bool isWithinPrecisionInterval(TReal a, TReal b, unsigned int interval_size = 1)
{
TReal min_a = a - (a - std::nextafter(a, std::numeric_limits<TReal>::lowest())) * interval_size;
TReal max_a = a + (std::nextafter(a, std::numeric_limits<TReal>::max()) - a) * interval_size;
return min_a <= b && max_a >= b;
}
Run Code Online (Sandbox Code Playgroud)
正如其他人所指出的那样,使用固定指数epsilon(例如0.0000001)对于远离epsilon值的值将是无用的.例如,如果您的两个值是10000.000977和10000,则这两个数字之间没有 32位浮点值 - 10000和10000.000977尽可能接近,而不是逐位相同.这里,小于0.0009的ε是没有意义的; 你也可以使用直线相等运算符.
同样,当两个值的大小接近epsilon时,相对误差增加到100%.
因此,尝试将诸如0.00001的固定点数与浮点值(指数是任意的)混合是一种毫无意义的练习.只有在可以确保操作数值位于窄域(即接近某个特定指数)的情况下,以及为该特定测试正确选择epsilon值时,这才会起作用.如果你从空中拉出一个数字("嘿!0.00001很小,那一定是好的!"),你注定会出现数字错误.我花了很多时间来调试糟糕的数字代码,其中一些可怜的schmuck在随机epsilon值中抛出,以使另一个测试用例工作.
如果你进行任何类型的数值编程并且认为你需要达到定点epsilons,请阅读BRUCE关于比较浮点数的文章.
这里证明 usingstd::numeric_limits::epsilon()不是答案 \xe2\x80\x94 它对于大于 1 的值会失败:
我上面的评论的证明:
\n\n#include <stdio.h>\n#include <limits>\n\ndouble ItoD (__int64 x) {\n // Return double from 64-bit hexadecimal representation.\n return *(reinterpret_cast<double*>(&x));\n}\n\nvoid test (__int64 ai, __int64 bi) {\n double a = ItoD(ai), b = ItoD(bi);\n bool close = std::fabs(a-b) < std::numeric_limits<double>::epsilon();\n printf ("%.16f and %.16f %s close.\\n", a, b, close ? "are " : "are not");\n}\n\nint main()\n{\n test (0x3fe0000000000000L,\n 0x3fe0000000000001L);\n\n test (0x3ff0000000000000L,\n 0x3ff0000000000001L);\n}\nRun Code Online (Sandbox Code Playgroud)\n\n运行会产生以下输出:
\n\n0.5000000000000000 and 0.5000000000000001 are close.\n1.0000000000000000 and 1.0000000000000002 are not close.\nRun Code Online (Sandbox Code Playgroud)\n\n请注意,在第二种情况(一且仅大于一)中,两个输入值尽可能接近,但比较时仍不接近。因此,对于大于 1.0 的值,您不妨只使用相等测试。比较浮点值时,固定 epsilon 不会为您带来帮助。
\nQt实现了两个功能,也许您可以从中学习:
static inline bool qFuzzyCompare(double p1, double p2)
{
return (qAbs(p1 - p2) <= 0.000000000001 * qMin(qAbs(p1), qAbs(p2)));
}
static inline bool qFuzzyCompare(float p1, float p2)
{
return (qAbs(p1 - p2) <= 0.00001f * qMin(qAbs(p1), qAbs(p2)));
}
Run Code Online (Sandbox Code Playgroud)
您可能需要以下功能,因为
请注意,比较p1或p2为0.0的值将不起作用,比较其中一个值为NaN或无穷大的值也将无效。如果值之一始终为0.0,请改用qFuzzyIsNull。如果其中一个值很可能是0.0,则一种解决方法是将两个值加1.0。
static inline bool qFuzzyIsNull(double d)
{
return qAbs(d) <= 0.000000000001;
}
static inline bool qFuzzyIsNull(float f)
{
return qAbs(f) <= 0.00001f;
}
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
392286 次 |
| 最近记录: |