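When porting some Fortran code to C, I was surprised to find that most of the execution-time difference between the Fortran program compiled with ifort (the Intel Fortran compiler) and the C program compiled with gcc comes from the evaluation of trigonometric functions (sin, cos). This surprised me because I used to believe, as this answer explains, that functions such as sine and cosine are implemented in microcode inside the microprocessor.
To pin the problem down more precisely, I wrote a small test program in Fortran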
program ftest
implicit none
real(8) :: x
integer :: i
x = 0d0
do i = 1, 10000000
x = cos (2d0 * x)
end do
write (*,*) x
end program ftest
On an Intel Q6600 processor, running 3.6.9-1-ARCH x86_64 Linux,
I get with ifort version 12.1.0
$ ifort -o ftest ftest.f90
$ time ./ftest
-0.211417093282753
real 0m0.280s
user 0m0.273s
sys 0m0.003s
whereas with gcc version 4.7.2 I get
$ gfortran -o ftest ftest.f90
$ time ./ftest …

I will demonstrate the problem using the following example program
{-# LANGUAGE BangPatterns #-}
data Point = Point !Double !Double
fmod :: Double -> Double -> Double
fmod a b | a < 0 = b - fmod (abs a) b
         | otherwise = if a < b then a
                       else let q = a / b
                            in b * (q - fromIntegral (floor q :: Int))
standardMap :: Double -> Point -> Point
standardMap k (Point q p) =
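  Point (fmod (q + p) (2 * pi)) (fmod …

The description of my problem is almost the same as in this post, but although I think I understand the corresponding solution, I cannot see how it applies to my problem, if at all.
Here is my example program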
{-# LANGUAGE BangPatterns #-}
import System.Random (randoms, mkStdGen)
import Control.Parallel.Strategies
import Control.DeepSeq (NFData)
import Data.List
data Point = Point !Double !Double
fmod :: Double -> Double -> Double
fmod a b | a < 0 = b - fmod (abs a) b
         | otherwise = if a < b then a
                       else let q = a / b
                            in b * (q - fromIntegral (floor q :: Int))
standardMap :: Double -> Point -> …

I am dealing with a computation that has, as an intermediate result, a list A = [B], i.e. a list of K lists, each of length L. The time complexity of computing an element of B is controlled by the parameter M and is theoretically linear in M. I would therefore expect the time complexity of computing A to be O(K*L*M). However, this is not the case, and I don't understand why.
Here is a simple, complete sketch program which exhibits …