Rob*_*art 12 arrays haskell repa accelerate-haskell
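The Haskell repa library provides automatic parallel array computation on CPUs. The accelerate library provides automatic data parallelism on GPUs. The APIs are quite similar, with identical representations of N-dimensional arrays. One can even convert between accelerate and repa arrays with fromRepa and toRepa in Data.Array.Accelerate.IO: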
fromRepa :: (Shapes sh sh', Elt e) => Array A sh e -> Array sh' e
toRepa :: Shapes sh sh' => Array sh' e -> Array A sh e
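There are multiple backends for accelerate, including LLVM, CUDA and FPGA (see Figure 2 of http://www.cse.unsw.edu.au/~keller/Papers/acc-cuda.pdf). I have also spotted a repa backend for accelerate, although that library does not appear to be maintained. Given that the repa and accelerate programming models are similar, I am hoping there is an elegant way of switching between them, i.e. a function written once could be executed either with repa's R.computeP or with one of accelerate's backends, e.g. with the CUDA run function.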
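Take a simple image processing thresholding function. If a grayscale pixel value is less than 50, it is set to 0; otherwise it retains its value. Here is what it does to a pumpkin:
The following code presents the repa and accelerate implementations: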
module Main where

import qualified Data.Array.Repa as R
import qualified Data.Array.Repa.IO.BMP as R
import qualified Data.Array.Accelerate as A
import qualified Data.Array.Accelerate.IO as A
import qualified Data.Array.Accelerate.Interpreter as A
import Data.Word

-- Apply threshold over image using accelerate (interpreter)
thresholdAccelerate :: IO ()
thresholdAccelerate = do
  img <- either (error . show) id `fmap` A.readImageFromBMP "pumpkin-in.bmp"
  let newImg = A.run $ A.map evalPixel (A.use img)
  A.writeImageToBMP "pumpkin-out.bmp" newImg
  where
    -- NOTE: the Prelude comparison below fails with:
    -- *** Exception: Prelude.Ord.compare applied to EDSL types
    evalPixel :: A.Exp A.Word32 -> A.Exp A.Word32
    evalPixel p = if p > 50 then p else 0

-- Apply threshold over image using repa
thresholdRepa :: IO ()
thresholdRepa = do
  let arr :: IO (R.Array R.U R.DIM2 (Word8,Word8,Word8))
      arr = either (error . show) id `fmap` R.readImageFromBMP "pumpkin-in.bmp"
  img <- arr
  newImg <- R.computeP (R.map applyAtPoint img)
  R.writeImageToBMP "pumpkin-out.bmp" newImg
  where
    -- Threshold each of the three colour channels independently
    applyAtPoint :: (Word8,Word8,Word8) -> (Word8,Word8,Word8)
    applyAtPoint (r,g,b) =
      let [r',g',b'] = map applyThresholdOnPixel [r,g,b]
      in (r',g',b')
    applyThresholdOnPixel x = if x > 50 then x else 0

data BackendChoice = Repa | Accelerate

main :: IO ()
main = do
  let userChoice = Repa -- pretend this is a command line flag
  case userChoice of
    Repa -> thresholdRepa
    Accelerate -> thresholdAccelerate
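The implementations of thresholdAccelerate and thresholdRepa are very similar. Is there an elegant way to write an array processing function once, and then choose programmatically between multicore CPUs (repa) and GPUs (accelerate)? I can think of choosing my import according to whether I want the CPU or the GPU, i.e. importing either Data.Array.Accelerate.CUDA or Data.Array.Repa, to execute an action of type Acc a with: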
run :: Arrays a => Acc a -> a
Or, to use a type class, roughly something like:
main :: IO ()
main = do
  let userChoice = Repa -- pretend this is a command line flag
  action <- case userChoice of
    Repa -> applyThreshold :: RepaBackend ()
    Accelerate -> applyThreshold :: CudaBackend ()
  action
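Or is it the case that, for each parallel array function I wish to express for both CPUs and GPUs, I must implement it twice: once with the repa library and again with the accelerate library?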
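For illustration only, the type-class idea might be sketched as follows. This is a minimal sketch that depends only on base: RepaBackend, CudaBackend, the Backend class and selectBackend are hypothetical names, plain lists stand in for repa and accelerate arrays, and each instance body is a placeholder for the real call (R.computeP for repa, a backend's run for accelerate). Note that the two case branches in the snippet above have different types, so the sketch resolves both to a common function type before returning.

```haskell
module Main where

import Data.Word (Word8)

-- Hypothetical backend tags; neither type comes from repa or accelerate.
data RepaBackend = RepaBackend
data CudaBackend = CudaBackend

-- Hypothetical class: each backend supplies its own thresholding action.
class Backend b where
  applyThreshold :: b -> [Word8] -> IO [Word8]

-- The per-pixel rule from the question: keep values above 50, zero the rest.
thresholdPixel :: Word8 -> Word8
thresholdPixel x = if x > 50 then x else 0

-- Placeholder: a real instance would call R.computeP (R.map ...) here.
instance Backend RepaBackend where
  applyThreshold _ = pure . map thresholdPixel

-- Placeholder: a real instance would run an accelerate A.map here,
-- written against Exp values rather than plain Word8.
instance Backend CudaBackend where
  applyThreshold _ = pure . map thresholdPixel

data BackendChoice = Repa | Accelerate

-- Both branches are resolved to the same function type, so the case
-- expression type-checks.
selectBackend :: BackendChoice -> [Word8] -> IO [Word8]
selectBackend choice = case choice of
  Repa       -> applyThreshold RepaBackend
  Accelerate -> applyThreshold CudaBackend

main :: IO ()
main = do
  let userChoice = Repa -- pretend this is a command line flag
  out <- selectBackend userChoice [10, 50, 51, 200]
  print out -- prints [0,0,51,200]
```

The class unifies only how a backend is selected and invoked. Each instance still needs its own kernel, because accelerate functions operate on Exp values while repa functions operate on ordinary Haskell values.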
Anonymous 9
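The short answer is that, at the moment, you unfortunately need to write both versions.
However, we are working on CPU support for Accelerate, which will remove the need for the Repa version of the code. In particular, Accelerate has recently gained a new LLVM-based backend that targets both GPUs and CPUs: https://github.com/AccelerateHS/accelerate-llvm
This new backend is still incomplete, buggy, and experimental, but we are planning to make it a viable alternative to the current CUDA backend.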
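As a rough sketch of what a single Accelerate program with a switchable backend might look like: the code below assumes a later accelerate 1.x release together with the accelerate-llvm-native and accelerate-llvm-ptx packages, whose modules Data.Array.Accelerate.LLVM.Native and Data.Array.Accelerate.LLVM.PTX each export a run function. It does not describe the experimental backend as it existed when this answer was written. BackendChoice, runWith and threshold are names chosen for illustration, and the GPU branch additionally assumes a working CUDA installation.

```haskell
module Main where

import qualified Data.Array.Accelerate             as A
import qualified Data.Array.Accelerate.LLVM.Native as CPU -- assumed: accelerate-llvm-native
import qualified Data.Array.Accelerate.LLVM.PTX    as GPU -- assumed: accelerate-llvm-ptx
import Data.Word (Word8)

-- Illustrative selector, not part of any library.
data BackendChoice = Cpu | Gpu

-- Both backends expose run with the same type, so dispatch is a plain
-- pattern match.
runWith :: A.Arrays a => BackendChoice -> A.Acc a -> a
runWith Cpu = CPU.run
runWith Gpu = GPU.run

-- The thresholding kernel, written once against Accelerate's Exp type.
-- A.> is Accelerate's own comparison; the Prelude operator cannot be
-- used on Exp values.
threshold :: A.Acc (A.Vector Word8) -> A.Acc (A.Vector Word8)
threshold = A.map (\p -> A.cond (p A.> 50) p 0)

main :: IO ()
main = do
  let pixels     = A.fromList (A.Z A.:. 4) [10, 50, 51, 200] :: A.Vector Word8
      userChoice = Cpu -- pretend this is a command line flag
  print (A.toList (runWith userChoice (threshold (A.use pixels))))
```

Under these assumptions the kernel is written only once, and the choice between CPU and GPU reduces to which run function is applied.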