How can I make immutable F# more performant?

Dax*_*ohl 13 f#

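I want to rewrite a large chunk of C# code in immutable F#. It is a device monitor, and the current implementation works by constantly getting data from a serial port and updating member variables based on the new data. I would like to move that to F# and get the benefits of immutable records, but my first shot at a proof-of-concept implementation is really slow.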

open System
open System.Diagnostics

type DeviceStatus = { RPM         : int;
                      Pressure    : int;
                      Temperature : int }

// I'm assuming my actual implementation, using serial data, would be something like 
// "let rec UpdateStatusWithSerialReadings (status:DeviceStatus) (serialInput:string[])".
// where serialInput is whatever the device streamed out since the previous check: something like
// ["RPM=90","Pres=50","Temp=85","RPM=40","Pres=23", etc.]
// The device streams out different parameters at different intervals, so I can't just wait for them all to arrive and aggregate them all at once.
// I'm just doing a POC here, so want to eliminate noise from parsing etc.
// So this just updates the status's RPM i times and returns the result.
let rec UpdateStatusITimes (status:DeviceStatus) (i:int) = 
    match i with
    | 0 -> status
    | _ -> UpdateStatusITimes {status with RPM = 90} (i - 1)

let initStatus = { RPM = 80 ; Pressure = 100 ; Temperature = 70 }
let stopwatch = new Stopwatch()

stopwatch.Start()
let endStatus = UpdateStatusITimes initStatus 100000000
stopwatch.Stop()

printfn "endStatus.RPM = %A" endStatus.RPM
printfn "stopwatch.ElapsedMilliseconds = %A" stopwatch.ElapsedMilliseconds
Console.ReadLine() |> ignore

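This runs in about 1400 ms on my machine, whereas the equivalent C# code (with mutable member variables) runs in about 310 ms. Is there any way to speed this up without losing immutability? I was hoping the F# compiler would notice that initStatus and all the intermediate status values are never reused, and so simply mutate those records behind the scenes, but I guess not.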

ild*_*arn 12

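In the F# community, imperative code and mutable data are not frowned upon as long as they are not part of your public interface. That is, using mutable data is fine as long as you encapsulate it and isolate it from the rest of your code. To that end, I suggest something like the following: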

type DeviceStatus =
  { RPM         : int
    Pressure    : int
    Temperature : int }

// one of the rare scenarios in which I prefer explicit classes,
// to avoid writing out all the get/set properties for each field
[<Sealed>]
type private DeviceStatusFacade =
    val mutable RPM         : int
    val mutable Pressure    : int
    val mutable Temperature : int
    new(s) =
        { RPM = s.RPM; Pressure = s.Pressure; Temperature = s.Temperature }
    member x.ToDeviceStatus () =
        { RPM = x.RPM; Pressure = x.Pressure; Temperature = x.Temperature }

let UpdateStatusITimes status i =
    let facade = DeviceStatusFacade(status)
    let rec impl i =
        if i > 0 then
            facade.RPM <- 90
            impl (i - 1)
    impl i
    facade.ToDeviceStatus ()

let initStatus = { RPM = 80; Pressure = 100; Temperature = 70 }
let stopwatch = System.Diagnostics.Stopwatch.StartNew ()
let endStatus = UpdateStatusITimes initStatus 100000000
stopwatch.Stop ()

printfn "endStatus.RPM = %d" endStatus.RPM
printfn "stopwatch.ElapsedMilliseconds = %d" stopwatch.ElapsedMilliseconds
stdin.ReadLine () |> ignore

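This way the public interface is unaffected: UpdateStatusITimes still takes and returns an intrinsically immutable DeviceStatus, but internally UpdateStatusITimes uses a mutable class to eliminate the allocation overhead.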

EDIT: (in response to a comment) Here is the class style I normally prefer, using a primary constructor and lets + properties rather than vals:

[<Sealed>]
type private DeviceStatusFacade(status) =
    let mutable rpm      = status.RPM
    let mutable pressure = status.Pressure
    let mutable temp     = status.Temperature
    member x.RPM         with get () = rpm      and set n = rpm      <- n
    member x.Pressure    with get () = pressure and set n = pressure <- n
    member x.Temperature with get () = temp     and set n = temp     <- n
    member x.ToDeviceStatus () =
        { RPM = rpm; Pressure = pressure; Temperature = temp }

But for simple facade classes, where every property is a blind getter/setter, I find this a bit tedious.

F# 3+ allows the following instead, but personally I still don't find it an improvement (unless one dogmatically avoids fields):

[<Sealed>]
type private DeviceStatusFacade(status) =
    member val RPM         = status.RPM with get, set
    member val Pressure    = status.Pressure with get, set
    member val Temperature = status.Temperature with get, set
    member x.ToDeviceStatus () =
        { RPM = x.RPM; Pressure = x.Pressure; Temperature = x.Temperature }

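  • F# 3.0 will introduce a new syntax that lets a property have backing storage without a separate `let` statement: `member val Property2 = "" with get, set` http://msdn.microsoft.com/en-us/library/dd483467(v=vs.110).aspx (4 upvotes)
  • @Dax: Ah, yes, that would definitely do it. :-] To be clear, if you start a process with the debugger attached, the JIT compiler performs *zero* optimizations; if you start the process and then attach the debugger, optimizations are enabled. (3 upvotes)
  • @Dax: Note that you could use another record with mutable fields instead of a class and get exactly the same effect. The downside is that if you use the same field names as the underlying non-facade record, they become ambiguous every time you create a record or pattern match on one, which makes both records more awkward to use. (2 upvotes)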
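
The mutable-record alternative from the last comment could look roughly like the sketch below. The type `MutableDeviceStatus` and its field names are hypothetical; they are chosen to differ from the `DeviceStatus` fields precisely to avoid the ambiguity the comment warns about.

```fsharp
type DeviceStatus =
    { RPM : int; Pressure : int; Temperature : int }

// Hypothetical mutable twin of DeviceStatus (not from the original answer).
// The field names differ on purpose so that record construction and
// pattern matching on either type stay unambiguous.
type MutableDeviceStatus =
    { mutable CurrentRPM : int
      mutable CurrentPressure : int
      mutable CurrentTemperature : int }

let updateStatusITimes (status: DeviceStatus) (i: int) : DeviceStatus =
    // Copy the immutable input into a single scratch record.
    let scratch =
        { CurrentRPM = status.RPM
          CurrentPressure = status.Pressure
          CurrentTemperature = status.Temperature }
    // Mutate the scratch record in place; nothing is allocated per iteration.
    for _ in 1 .. i do
        scratch.CurrentRPM <- 90
    // Convert back to the immutable record at the boundary.
    { RPM = scratch.CurrentRPM
      Pressure = scratch.CurrentPressure
      Temperature = scratch.CurrentTemperature }
```

Callers still see only `DeviceStatus` going in and coming out; the mutable record never escapes the function.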

kvb*_*kvb 7

This doesn't answer your question, but it may be worth stepping back and considering the big picture:

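  1. What do you see as the advantage of immutable data structures for this use case? F# supports mutable data structures too.
  2. You claim the F# is "really slow", but it is only 4.5x slower than the C# code and still performs over 70 million updates per second. Is that likely to be unacceptable performance for your actual application? Do you have a specific performance target? Is there reason to believe that this kind of code will be the bottleneck in your application?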

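Design is always about tradeoffs. You may find that for recording many changes in a short period of time, immutable data structures carry an unacceptable performance penalty given your needs. On the other hand, if you have requirements such as keeping track of multiple older versions of a data structure at once, the benefits of immutable data structures may make them attractive despite the performance penalty.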
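
As an illustration of the "multiple older versions" case, here is a minimal sketch that keeps every intermediate status. The `Reading` type and `applyReading` function are hypothetical stand-ins for parsed serial input; parsing strings such as "RPM=90" is left out.

```fsharp
type DeviceStatus = { RPM : int; Pressure : int; Temperature : int }

// Hypothetical parsed reading (an assumption, not part of the question).
type Reading =
    | Rpm of int
    | Pres of int
    | Temp of int

// Produce a new status from the old one; the old one stays valid.
let applyReading (status: DeviceStatus) (reading: Reading) : DeviceStatus =
    match reading with
    | Rpm v -> { status with RPM = v }
    | Pres v -> { status with Pressure = v }
    | Temp v -> { status with Temperature = v }

let initStatus = { RPM = 80; Pressure = 100; Temperature = 70 }
let readings = [ Rpm 90; Pres 50; Temp 85; Rpm 40; Pres 23 ]

// List.scan returns the initial state followed by every intermediate state,
// so older versions remain available without any defensive copying.
let history = List.scan applyReading initStatus readings
let latest = List.last history
```

With mutable member variables, retaining the same history would require copying the object on every update, which gives up most of the speed advantage.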

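  • "Only 4.5x slower" - you call that "only"? :) (2 upvotes)
  • @Robert - it's all relative... In some cases being 5% slower may be too slow, while in others 10x may be acceptable. Considering the number of objects that have to be created in the immutable scenario compared with the amount of work done in the loop in the mutable case, I don't think a 4.5x slowdown is terrible. Performing any amount of computation in the loop should bring the numbers closer together. (2 upvotes)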

pet*_*ebu 7

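I suspect the performance problem you are seeing is due to the block memory zeroing involved in cloning the record on every iteration of the loop (plus the negligible time spent allocating it and subsequently garbage collecting it). You could rewrite your example using a struct: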

[<Struct>]
type DeviceStatus =
    val RPM : int
    val Pressure : int
    val Temperature : int
    new(rpm:int, pres:int, temp:int) = { RPM = rpm; Pressure = pres; Temperature = temp }

let rec UpdateStatusITimes (status:DeviceStatus) (i:int) = 
    match i with
    | 0 -> status
    | _ -> UpdateStatusITimes (DeviceStatus(90, status.Pressure, status.Temperature)) (i - 1)

let initStatus = DeviceStatus(80, 100, 70)

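Now the performance will be close to that of using global mutable variables, or of redefining `UpdateStatusITimes status i` as `UpdateStatusITimes rpm pres temp i`. This only works if your struct is no more than 16 bytes long; otherwise it will be copied in the same sluggish manner as the record.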
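
The second alternative mentioned above, passing the fields as separate arguments, might be sketched as follows. This assumes the record definition from the question; the inner function name `loop` is arbitrary, and no timings are claimed for it here.

```fsharp
type DeviceStatus = { RPM : int; Pressure : int; Temperature : int }

let updateStatusITimes (status: DeviceStatus) (i: int) : DeviceStatus =
    // The fields travel as plain int arguments through the tail-recursive
    // loop, so no record is allocated until the final result is built.
    let rec loop rpm pres temp n =
        if n <= 0 then
            { RPM = rpm; Pressure = pres; Temperature = temp }
        else
            loop 90 pres temp (n - 1)
    loop status.RPM status.Pressure status.Temperature i
```

The function still accepts and returns an immutable `DeviceStatus`, so callers are unaffected by the change.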

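If, as you hinted in the comments, you intend to use this as part of a shared-memory multithreaded design, you will need mutability at some point. Your options are (a) a shared mutable variable for each parameter, (b) one shared mutable variable containing a struct, or (c) a shared facade object containing mutable fields (as in ildjarn's answer). I would go for the last one, since it is nicely encapsulated and scales beyond four int fields.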
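
A minimal sketch of option (c) for the multithreaded case is shown below. The class name `SharedDeviceStatus`, its members, and the use of `lock` are assumptions made for illustration; ildjarn's facade is single-threaded and does no locking.

```fsharp
type DeviceStatus = { RPM : int; Pressure : int; Temperature : int }

// Hypothetical thread-safe facade: writers update one field at a time,
// readers take a consistent immutable snapshot.
[<Sealed>]
type SharedDeviceStatus(initial: DeviceStatus) =
    let gate = obj ()
    let mutable rpm = initial.RPM
    let mutable pressure = initial.Pressure
    let mutable temp = initial.Temperature
    member x.SetRPM (value: int) = lock gate (fun () -> rpm <- value)
    member x.SetPressure (value: int) = lock gate (fun () -> pressure <- value)
    member x.SetTemperature (value: int) = lock gate (fun () -> temp <- value)
    // All three fields are read under the same lock, so a snapshot never
    // mixes values from before and after a concurrent update.
    member x.Snapshot () : DeviceStatus =
        lock gate (fun () -> { RPM = rpm; Pressure = pressure; Temperature = temp })

let shared = SharedDeviceStatus({ RPM = 80; Pressure = 100; Temperature = 70 })
shared.SetRPM 90
let snapshot = shared.Snapshot ()
```

Taking a lock on every update costs more than a bare field write, so whether this is acceptable depends on how often the serial reader updates the status.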

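  • @ildjarn, @Dax: http://stackoverflow.com/questions/2437925/net-why-is-struct-better-with-being-less-than-16-bytes It appears that structs larger than 16 bytes are not copied with MOV instructions but with a block memory copy. That could explain why having more than four int fields affects performance so much in FSI. It doesn't explain why the standalone compiler behaves differently, though. (2 upvotes)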

Jon*_*rop 5

Using a tuple as follows is 15x faster than your original solution:

open System.Diagnostics // needed for Stopwatch below

type DeviceStatus = int * int * int

let rec UpdateStatusITimes (rpm, pressure, temp) (i:int) = 
    match i with
    | 0 -> rpm, pressure, temp
    | _ -> UpdateStatusITimes (90,pressure,temp) (i - 1)

while true do
  let initStatus = 80, 100, 70
  let stopwatch = new Stopwatch()

  stopwatch.Start()
  let rpm,_,_ as endStatus = UpdateStatusITimes initStatus 100000000
  stopwatch.Stop()

  printfn "endStatus.RPM = %A" rpm
  printfn "Took %fs" stopwatch.Elapsed.TotalSeconds

By the way, you should use stopwatch.Elapsed.TotalSeconds when timing.
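
One way to apply that advice consistently is a small timing helper, sketched here. The name `time` and its signature are assumptions; only `Stopwatch.StartNew` and `Elapsed.TotalSeconds` come from the .NET API.

```fsharp
open System.Diagnostics

// Hypothetical helper: runs f once, prints the elapsed wall-clock time in
// fractional seconds, and returns whatever f returned.
let time (label: string) (f: unit -> 'a) : 'a =
    let stopwatch = Stopwatch.StartNew ()
    let result = f ()
    stopwatch.Stop ()
    printfn "%s took %fs" label stopwatch.Elapsed.TotalSeconds
    result

// Usage: wrap the code under test in a lambda.
let total = time "sum" (fun () -> List.sum [ 1 .. 1000 ])
```

Unlike ElapsedMilliseconds, which is a whole number of milliseconds, TotalSeconds is a floating-point value, so short runs are not truncated.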