Posts by mja*_*bse

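How to deal with branch mispredictions that seem to depend on machine code position?

While trying to benchmark implementations of a simple sparse unit lower triangular backward solve in CSC format, I observe strange behavior. The performance seems to vary drastically depending on where the assembly instructions end up in the executable. I observe this in many different variations of the same problem. One minimal example is obtained by duplicating the implementation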

void lowerUnitTriangularTransposedBacksolve(const EntryIndex* col_begin_indices,
                                            const Index* row_indices,
                                            const Value* values,
                                            const Index dimension, Value* x) {
  if (dimension == 0) return;

  EntryIndex entry_index = col_begin_indices[dimension];
  Index col_index = dimension;
  do {
    col_index -= 1;
    const EntryIndex col_begin = col_begin_indices[col_index];

    if (entry_index > col_begin) {
      Value x_temp = x[col_index];
      do {
        entry_index -= 1;
        x_temp -= values[entry_index] * x[row_indices[entry_index]];
      } while (entry_index != col_begin);
      x[col_index] = x_temp;
    }
  } while (col_index != 0);
}


in the two functions benchmarkBacksolve1 and benchmarkBacksolve2 …
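To make the CSC layout concrete, here is a minimal sketch of how the routine above could be called on a 3×3 matrix. The type aliases and the helper `solveSmallExample` are assumptions for illustration only; the post does not show its type definitions, and this is not the benchmark code itself. Only the strictly lower entries are stored, column by column, and the unit diagonal is implicit.

```cpp
#include <cassert>
#include <vector>

// Assumed aliases: the post does not show the real definitions.
using EntryIndex = int;
using Index = int;
using Value = double;

// Routine as given in the post: solves L^T x = b in place, where L is
// unit lower triangular and stored in CSC format without its diagonal.
void lowerUnitTriangularTransposedBacksolve(const EntryIndex* col_begin_indices,
                                            const Index* row_indices,
                                            const Value* values,
                                            const Index dimension, Value* x) {
  if (dimension == 0) return;

  EntryIndex entry_index = col_begin_indices[dimension];
  Index col_index = dimension;
  do {
    col_index -= 1;
    const EntryIndex col_begin = col_begin_indices[col_index];

    if (entry_index > col_begin) {
      Value x_temp = x[col_index];
      do {
        entry_index -= 1;
        x_temp -= values[entry_index] * x[row_indices[entry_index]];
      } while (entry_index != col_begin);
      x[col_index] = x_temp;
    }
  } while (col_index != 0);
}

// Hypothetical helper: solves L^T x = b for
//   L = [1 0 0; 2 1 0; 3 4 1],  b = (6, 5, 1).
std::vector<Value> solveSmallExample() {
  // Column 0 holds rows 1 and 2, column 1 holds row 2, column 2 is empty.
  const std::vector<EntryIndex> col_begin_indices = {0, 2, 3, 3};
  const std::vector<Index> row_indices = {1, 2, 2};
  const std::vector<Value> values = {2.0, 3.0, 4.0};

  std::vector<Value> x = {6.0, 5.0, 1.0};  // right-hand side, overwritten
  assert(col_begin_indices.size() == x.size() + 1);

  lowerUnitTriangularTransposedBacksolve(
      col_begin_indices.data(), row_indices.data(), values.data(),
      static_cast<Index>(x.size()), x.data());
  return x;
}
```

For this input the solution is (1, 1, 1), since L^T applied to that vector gives (6, 5, 1). The inner `do`/`while` loop has a trip count equal to the number of stored entries per column, which is the data-dependent branch the question is concerned with.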

c++ performance benchmarking cpu-architecture branch-prediction

mja*_*bse

2022-08-21

Score: 7

Answers: 1

Views: 467

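Is it allowed to pass an absent assumed-shape array as an optional argument of another procedure?

In the following minimal example, is it allowed to pass the optional dummy argument y of test_wrapper, which may not be present, as the actual argument for the corresponding optional dummy argument y of test?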

program main
    implicit none
    real :: x = 5.0
    call test_wrapper(x)

contains
    subroutine test_wrapper(x, y)
        implicit none
        real, intent(in) :: x
        real, dimension(:), intent(out), optional :: y
        call test(x, y)
    end subroutine test_wrapper

    subroutine test(x, y)
        implicit none
        real, intent(in) :: x
        real, dimension(:), intent(out), optional :: y
        if (present(y)) then
            y = x
        end if
    end subroutine test
end program


UndefinedBehaviourSanitizer raises an error, indicating that it is not: https://godbolt.org/z/nKj1h6G9r

In this Fortran …

standards fortran optional-parameters sanitizer undefined-behavior

mja*_*bse

lucky-day

Score: 4

Answers: 1

Views: 144

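Comparing a floating-point addition with its temporarily stored result in Fortran

I noticed some behavior of the == operator for floating-point types that seems strange to me. I know that I cannot expect something like 0.1 + 0.2 == 0.3 to be .true. because of the limitations of floating-point representation, and that floating-point comparisons should therefore usually be done with something like abs(x - y) < tolerance. However, I would still expect this minimal program to output T in any case: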

program main
    integer, parameter :: dp = kind(0d0)
    real(kind=dp) :: a, b, c

    a = 4.4090680619790817d+002
    b = 1.0000000000000000d-004
    c = (a + b)

    print *, (c == (a + b))
end program


When compiling this program with gfortran 7.3.1 on 64-bit Manjaro Linux using

gfortran -o a.out minimal_example.F90 && a.out


I do indeed get the output T. However, when compiling and executing a 32-bit executable with

gfortran -m32 -o a.out minimal_example.F90 && a.out


the result is F. It seems to me that storing the result of the addition slightly changes its value, since the difference abs(c - (a + b)) is roughly …
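The suspected effect can be sketched outside Fortran. The following C++ fragment is only an illustration under stated assumptions: it assumes `long double` is wider than `double` (true for the x87 80-bit format on typical x86 Linux toolchains, not true everywhere), and the helper names `nearlyEqual` and `storingChangesValue` are hypothetical. It mimics a sum that is evaluated in a wider format and then rounded when stored into a `double` variable.

```cpp
#include <cmath>

// Hypothetical helper in the spirit of abs(x - y) < tolerance.
bool nearlyEqual(long double x, long double y, long double tolerance) {
  return std::fabs(x - y) < tolerance;
}

// Returns true when rounding the wide sum to double changed its value.
// Assumption: long double has more precision than double; where both
// types share one format this always returns false.
bool storingChangesValue(double a, double b) {
  const long double wide =
      static_cast<long double>(a) + static_cast<long double>(b);
  const double stored = static_cast<double>(wide);  // like "c = (a + b)"
  return static_cast<long double>(stored) != wide;  // like "c == (a + b)"
}
```

Calling `storingChangesValue(4.4090680619790817e+002, 1.0e-004)` with the values from the program above may return true on a platform with 80-bit `long double`, while a sum that is exactly representable, such as 1.0 + 2.0, is unaffected. A tolerance comparison such as `nearlyEqual` does not depend on whether the intermediate result was kept in a wider format.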

floating-point precision fortran gfortran

mja*_*bse

2018-04-08

Score: 3

Answers: 1

Views: 149

Tag statistics

fortran ×2

benchmarking ×1

branch-prediction ×1

c++ ×1

cpu-architecture ×1

floating-point ×1

gfortran ×1

optional-parameters ×1

performance ×1

precision ×1

sanitizer ×1

standards ×1

undefined-behavior ×1
