Gab*_*lez 20 performance serialization haskell
Benchmarks show that deserializing my data structure (detailed below) with cereal takes about 100 times longer than reading the same data off the drive:
benchmarking Read
mean: 465.7050 us, lb 460.9873 us, ub 471.0938 us, ci 0.950
std dev: 25.79706 us, lb 22.19820 us, ub 30.81870 us, ci 0.950
found 4 outliers among 100 samples (4.0%)
4 (4.0%) high mild
variance introduced by outliers: 53.460%
variance is severely inflated by outliers
benchmarking Read + Decode
collecting 100 samples, 1 iterations each, in estimated 6.356502 s
mean: 68.85135 ms, lb 67.65992 ms, ub 70.05832 ms, ci 0.950
std dev: 6.134430 ms, lb 5.607914 ms, ub 6.755639 ms, ci 0.950
variance introduced by outliers: 74.863%
variance is severely inflated by outliers
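Profiling a typical run of my program, which deserializes this data structure, backs this up: 98% of the time is spent deserializing the data, and about 1% goes to IO plus the core algorithm: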
COST CENTRE MODULE %time %alloc
getWord8 Data.Serialize.Get 30.5 40.4
unGet Data.Serialize.Get 29.5 17.9
getWord64be Data.Serialize.Get 14.0 10.7
getListOf Data.Serialize.Get 10.2 12.8
roll Data.Serialize 8.2 11.5
shiftl_w64 Data.Serialize.Get 3.4 2.9
decode Data.Serialize 2.9 3.1
main Main 1.3 0.6
The data structure I am deserializing is an IntMap [Triplet Atom], and the component types are defined as follows:
type Triplet a = (a, a, a)
data Point = Point {
    _x :: {-# UNPACK #-} !Double ,
    _y :: {-# UNPACK #-} !Double ,
    _z :: {-# UNPACK #-} !Double }
data Atom = Atom {
    _serial :: {-# UNPACK #-} !Int ,
    _r :: {-# UNPACK #-} !Point ,
    _n :: {-# UNPACK #-} !Word64 }
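To make the shape of this value concrete, here is a minimal sketch that builds a tiny IntMap [Triplet Atom] using only base and containers. The helper names (mkAtom, sampleIndex, atomCount) and the sample values are hypothetical and exist only for illustration. Note that every list cell and every (,,) tuple is a separate heap object whose components are lazy pointers, even though the fields inside Atom and Point are strict and unpacked.

```haskell
import           Data.IntMap (IntMap)
import qualified Data.IntMap as IntMap
import           Data.Word   (Word64)

type Triplet a = (a, a, a)

data Point = Point
    { _x :: {-# UNPACK #-} !Double
    , _y :: {-# UNPACK #-} !Double
    , _z :: {-# UNPACK #-} !Double
    } deriving (Eq, Show)

data Atom = Atom
    { _serial :: {-# UNPACK #-} !Int
    , _r      :: {-# UNPACK #-} !Point
    , _n      :: {-# UNPACK #-} !Word64
    } deriving (Eq, Show)

-- Hypothetical helper: an atom at the origin with the given serial number.
mkAtom :: Int -> Atom
mkAtom s = Atom s (Point 0 0 0) 0

-- Hypothetical sample sub-index: key 1 holds two triplets, key 2 holds one.
sampleIndex :: IntMap [Triplet Atom]
sampleIndex = IntMap.fromList
    [ (1, [ (mkAtom 1, mkAtom 2, mkAtom 3)
          , (mkAtom 4, mkAtom 5, mkAtom 6) ])
    , (2, [ (mkAtom 7, mkAtom 8, mkAtom 9) ])
    ]

-- Total number of atoms stored: three per triplet.
atomCount :: IntMap [Triplet Atom] -> Int
atomCount = (* 3) . sum . map length . IntMap.elems

main :: IO ()
main = print (atomCount sampleIndex)  -- prints 9
```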
I am using the default IntMap, (,,) and [] instances provided by cereal, and the following instances for my custom types:
instance Serialize Point where
    put (Point x y z) = do
        put x
        put y
        put z
    get = Point <$> get <*> get <*> get
instance Serialize Atom where
    put (Atom s r n) = do
        put s
        put r
        put n
    get = Atom <$> get <*> get <*> get
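So my questions are:
- Can I change my data structure (i.e. IntMap / []) to make deserialization faster?
- Can I change my data types (i.e. Atom / Point) to make deserialization faster?
- Are there faster alternatives to cereal within Haskell, or should I store the data structure in C-land for faster deserialization (i.e. use mmap)?
The files I am deserializing are sub-indices for a search engine. The full index cannot fit in memory on the target computer (a consumer-grade desktop), so I store each sub-index on disk, then read and decode the sub-indices pointed to by the initial global index, which resides in memory. I am not concerned about serialization speed: searching the index is the bottleneck for the end user, and cereal's current serialization performance is satisfactory for generating and updating the index.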
Edit:
I tried Don's suggestion of using a space-efficient triplet, and this quadrupled the speed:
benchmarking Read
mean: 468.9671 us, lb 464.2564 us, ub 473.8867 us, ci 0.950
std dev: 24.67863 us, lb 21.71392 us, ub 28.39479 us, ci 0.950
found 2 outliers among 100 samples (2.0%)
2 (2.0%) high mild
variance introduced by outliers: 50.474%
variance is severely inflated by outliers
benchmarking Read + Decode
mean: 15.04670 ms, lb 14.99097 ms, ub 15.10520 ms, ci 0.950
std dev: 292.7815 us, lb 278.8742 us, ub 308.1960 us, ci 0.950
variance introduced by outliers: 12.303%
variance is moderately inflated by outliers
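However, decoding is still the bottleneck, taking about 25 times as long as the IO. Also, can anybody explain why Don's suggestion works? Does this mean that switching from lists to something else (like an array?) might give an improvement too?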
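The exact type Don suggested is not shown here, so the following is only a sketch of one plausible reading: replace the polymorphic (a, a, a) tuple with a monomorphic constructor whose fields are strict and unpacked. The names AtomTriplet, fromTuple and toTuple are hypothetical. With a plain tuple, each component is a lazy pointer to a separately allocated Atom (or to an unevaluated thunk). With strict UNPACK fields, and when compiled with optimization, the three atoms can be stored inline in a single heap object, which leaves fewer allocations and pointers for the decoder to build.

```haskell
import Data.Word (Word64)

type Triplet a = (a, a, a)

data Point = Point
    {-# UNPACK #-} !Double
    {-# UNPACK #-} !Double
    {-# UNPACK #-} !Double
    deriving (Eq, Show)

data Atom = Atom
    {-# UNPACK #-} !Int
    {-# UNPACK #-} !Point
    {-# UNPACK #-} !Word64
    deriving (Eq, Show)

-- Hypothetical dense replacement for `Triplet Atom`:
-- monomorphic, strict, and unpacked into one constructor.
data AtomTriplet = AtomTriplet
    {-# UNPACK #-} !Atom
    {-# UNPACK #-} !Atom
    {-# UNPACK #-} !Atom
    deriving (Eq, Show)

-- Conversions to and from the original tuple representation.
fromTuple :: Triplet Atom -> AtomTriplet
fromTuple (a, b, c) = AtomTriplet a b c

toTuple :: AtomTriplet -> Triplet Atom
toTuple (AtomTriplet a b c) = (a, b, c)

main :: IO ()
main = do
    let t = ( Atom 1 (Point 0 0 0) 10
            , Atom 2 (Point 1 1 1) 20
            , Atom 3 (Point 2 2 2) 30 )
    -- The conversion round-trips without losing information.
    print (toTuple (fromTuple t) == t)
```

A Serialize instance for such a type would follow the same pattern as the Point and Atom instances: three puts and an applicative get.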
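Edit #2: I just switched to the latest Haskell Platform and reran the profiling for cereal. The output is considerably more detailed, and I have provided some of it.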
OK. To answer this with a summary of the advice. For fast deserialization of data:
- cereal (strict bytestring output) or binary (lazy bytestring output)
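As a rough illustration of the strict versus lazy distinction, the sketch below uses the binary package, which ships with GHC. Its encode returns a lazy ByteString and its decode consumes one. The cereal equivalents in Data.Serialize work on strict ByteStrings, and cereal's decode returns an Either String result instead of throwing on malformed input. The simplified Point type and the sample values are assumptions made for the example.

```haskell
import           Data.Binary          (Binary (..), decode, encode)
import qualified Data.ByteString      as S
import qualified Data.ByteString.Lazy as L

-- Simplified stand-in for the Point type from the question.
data Point = Point !Double !Double !Double
    deriving (Eq, Show)

instance Binary Point where
    put (Point x y z) = put x >> put y >> put z
    get = Point <$> get <*> get <*> get

main :: IO ()
main = do
    let p           = Point 1.5 (-2.0) 3.25
        lazyBytes   = encode p              :: L.ByteString
        -- Forcing the lazy output into one contiguous strict buffer.
        strictBytes = L.toStrict lazyBytes  :: S.ByteString
    -- Decoding straight from the lazy output.
    print (decode lazyBytes == p)
    -- Decoding after a detour through a strict ByteString.
    print (decode (L.fromStrict strictBytes) == p)
```

Both checks print True. Whether strict or lazy input is faster for a given workload is something to measure, not assume.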