Why is a Euclidean distance function slower in C than in Java?

swd*_*don 1 c java performance benchmarking

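I implemented and benchmarked the following function in C and in Java using the code below. For C I get about 1.688852 seconds, while Java takes only 0.355038 seconds. Even if I remove the `sqrt` call, inline the code by hand, or change the function signature to accept six `double` coordinates (to avoid going through pointers), the C elapsed time does not improve much.

I compile the C program with `cc -O2 main.c -lm`. For Java, I run the application from IntelliJ IDEA with the default JVM options (Java 8, OpenJDK).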

C:

    #include <math.h>
    #include <stdio.h>
    #include <time.h>

    typedef struct point3d
    {
      double x;
      double y;
      double z;
    } point3d_t;

    double distance(point3d_t *from, point3d_t *to);

    int main(int argc, char const *argv[])
    {
      point3d_t from = {.x = 2.3, .y = 3.45, .z = 4.56};
      point3d_t to = {.x = 5.678, .y = 3.45, .z = -9.0781};

      double time = 0.0;
      int count = 10000000;

      for (size_t i = 0; i < count; i++)
      {
        clock_t tic = clock();
        double d = distance(&from, &to);
        clock_t toc = clock();
        time += ((double) (toc - tic) / CLOCKS_PER_SEC);
      }

      printf("Elapsed: %f seconds\n", time);

      return 0;
    }

    double distance(point3d_t *from, point3d_t *to)
    {
      double dx = to->x - from->x;
      double dy = to->y - from->y;
      double dz = to->z - from->z;

      double d2 = (dx * dx) + (dy * dy) + (dz + dz);
      return sqrt(d2);
    }

Java:

    public class App
    {
        static Random rnd = new Random();

        public static void main( String[] args )
        {
            var sw = new StopWatch();
            var time = 0.0;
            var count = 10000000;

            for (int i = 0; i < count; i++) {
                var from = Vector3D.of(rnd.nextDouble(), rnd.nextDouble(), rnd.nextDouble());
                var to = Vector3D.of(rnd.nextDouble(), rnd.nextDouble(), rnd.nextDouble());

                sw.start();
                var dist = distance(from, to);
                sw.stop();
                time += sw.getTime(TimeUnit.NANOSECONDS);
                sw.reset();
            }

            System.out.printf("Time: %f seconds\n", time / 1e09);
        }

        public static double distance(Vector3D from, Vector3D to) {
            var dx = to.getX() - from.getX();
            var dy = to.getY() - from.getY();
            var dz = to.getZ() - from.getZ();

            return Math.sqrt((dx * dx) + (dy * dy) + (dz * dz));
        }
    }

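My goal is to understand why the C program is slower than the Java program, and to make it run faster.

Edit: I use random values in the Java program to try to make sure the JVM does not do anything funny, such as caching the result and skipping the computation entirely.

Edit: I updated both snippets to time the whole loop rather than each individual call (using `clock_gettime()` in the C version) and to stop discarding the result of the call: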

    #include <math.h>
    #include <stdio.h>
    #include <time.h>

    typedef struct point3d
    {
      double x;
      double y;
      double z;
    } point3d_t;

    double distance(point3d_t *from, point3d_t *to);

    int main(int argc, char const *argv[])
    {
      point3d_t from = {.x = 2.3, .y = 3.45, .z = 4.56};
      point3d_t to = {.x = 5.678, .y = 3.45, .z = -9.0781};

      struct timespec fs;
      struct timespec ts;

      long time = 0;
      int count = 10000000;
      double dist = 0;

      clock_gettime(CLOCK_REALTIME, &fs);

      for (size_t i = 0; i < count; i++)
      {
        dist = distance(&from, &to);
      }

      clock_gettime(CLOCK_REALTIME, &ts);
      time = ts.tv_nsec - fs.tv_nsec;

      if (dist == 0.001)
      {
        printf("hello\n");
      }

      printf("Elapsed: %f sec\n", (double) time / 1e9);

      return 0;
    }

    double distance(point3d_t *from, point3d_t *to)
    {
      double dx = to->x - from->x;
      double dy = to->y - from->y;
      double dz = to->z - from->z;

      double d2 = (dx * dx) + (dy * dy) + (dz + dz);
      return sqrt(d2);
    }

Java:

    public class App
    {
        static Random rnd = new Random();

        public static void main( String[] args )
        {
            var from = Vector3D.of(rnd.nextDouble(), rnd.nextDouble(), rnd.nextDouble());
            var to = Vector3D.of(rnd.nextDouble(), rnd.nextDouble(), rnd.nextDouble());

            var time = 0.0;
            var count = 10000000;
            double dist = 0.0;

            var start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                dist = distance(from, to);
            }

            var end = System.nanoTime();
            time = end - start;

            if (dist == rnd.nextDouble()) {
                System.out.printf("hello! %f\n", dist);
            }

            dist = dist + 1;
            System.out.printf("Time: %f sec\n", (double) time / 1e9);
            System.out.printf("Yohoo! %f\n", dist);
        }

        public static double distance(Vector3D from, Vector3D to) {
            var dx= to.getX() - from.getX();
            var dy = to.getY() - from.getY();
            var dz = to.getZ() - from.getZ();

            return Math.sqrt((dx * dx) + (dy * dy) + (dz * dz));
        }
    }

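The C code is compiled with `gcc -Wall -std=gnu99 -O2 main.c -lm`. The results are now 0.06323 seconds for the C code and 0.006325 seconds for Java.

Edit: As Jérôme Richard and Peter Cordes pointed out, my benchmark was flawed, not to mention that I was taking the `sqrt` of a negative number in the C version. When I compiled the C program with `-fno-math-errno`, it clocked in at 0 seconds. I then compiled the C program with `gcc -O2 -std=gnu99 main.c -lm`. Now the C program effectively clocks in at zero seconds (271 ns), while Java clocks in at 0.007217 seconds. All is in order :)

Below is the final code: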

C:

    #include <math.h>
    #include <stdio.h>
    #include <time.h>

    typedef struct point3d
    {
      double x;
      double y;
      double z;
    } point3d_t;

    double distance(point3d_t *from, point3d_t *to);

    int main(int argc, char const *argv[])
    {
      point3d_t from = {.x = 2.3, .y = 3.45, .z = 4.56};
      point3d_t to = {.x = 5.678, .y = 3.45, .z = -9.0781};

      struct timespec fs;
      struct timespec ts;

      long time = 0;
      int count = 10000000;
      double dist = 0;

      clock_gettime(CLOCK_REALTIME, &fs);

      for (size_t i = 0; i < count; i++)
      {
        dist = distance(&from, &to);
      }

      clock_gettime(CLOCK_REALTIME, &ts);
      time = ts.tv_nsec - fs.tv_nsec;

      printf("hello %f \n", dist);
      printf("Elapsed: %f ns\n", (double) time);
      printf("Elapsed: %f sec\n", (double) time / 1e9);

      return 0;
    }

    double distance(point3d_t *from, point3d_t *to)
    {
      double dx = (to->x) - (from->x);
      double dy = (to->y) - (from->y);
      double dz = (to->z) - (from->z);

      double d2 = (dx * dx) + (dy * dy) + (dz * dz);
      return sqrt(d2);
    }

Java:

    public class App
    {
        static Random rnd = new Random();

        public static void main( String[] args )
        {
            var from = Vector3D.of(2.3, 3.45, 4.56);
            var to = Vector3D.of(5.678, 3.45, -9.0781);

            var time = 0.0;
            var count = 10000000;
            double dist = 0.0;

            var start = System.nanoTime();
            for (int i = 0; i < count; i++) {
                dist = distance(from, to);
            }

            var end = System.nanoTime();
            time = end - start;

            System.out.printf("Yohoo! %f\n", dist);
            System.out.printf("Time: %f ns\n", (double) time / 1e9);
        }

        public static double distance(Vector3D from, Vector3D to) {
            var dx = to.getX() - from.getX();
            var dy = to.getY() - from.getY();
            var dz = to.getZ() - from.getZ();

            var d2 =  (dx * dx) + (dy * dy) + (dz * dz);
            return Math.sqrt(d2);
        }
    }

Jér*_*ard 5

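First, the method used to measure the timings is very imprecise. The current approach introduces a huge bias that is probably bigger than the quantity being measured. Indeed, `clock` is not very precise on many platforms (about 1 ms on my machine, and generally no better than 1 us on almost all platforms). On top of that, 10,000,000 iterations strongly amplify the imprecision. If you want to measure the loop precisely, you need to move the clock calls outside the loop (and use a more precise measurement function if possible).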

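The main problem, though, is that the result of the function is unused, and the Java JIT can see this and partially optimize the call away. GCC cannot, because of the standard behaviour of math functions (`errno` introduces a side effect that does not exist in the Java code). You can disable the use of `errno` with the command-line flag `-fno-math-errno`. With it, GCC can fully optimize the function (i.e. remove the function call), and the resulting time is much smaller. However, the benchmark is then flawed, and that is probably not what you want to measure. If you want to write a correct benchmark, you need to read the computed values. For example, you can compute a checksum, at least to check that the results are correct/equivalent.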

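  • There is a canonical Q&A ([Idiomatic way of performance evaluation?](/sf/ask/4220439121/)) covering many of the flawed methodologies you are talking about that are not specific to the function being tested (math errno). (3 upvotes)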