在Clojure中更改ref时,为什么通勤函数被调用两次?

Sam*_*ies 8 clojure stm

我想我理解在Clojure交易中通勤和改变的想法之间的基本区别.

更改基本上从事务的开始到结束"锁定"身份,以便多个事务必须按顺序执行.

Commute仅将锁应用于身份的实际值更改,以便事务中的其他操作可以在不同的时间运行,并具有不同的世界视图.

但我对某些事感到困惑.让我们定义一个带有副作用的函数和一个参与的函数:

(defn fn-with-side-effects [state]
    (println "Hello!")
    (inc state))

(def test-counter (ref 0))
Run Code Online (Sandbox Code Playgroud)

现在,如果我们使用alter,我们会看到预期的行为:

user=> (dosync (alter test-counter fn-with-side-effects))
Hello!
1
Run Code Online (Sandbox Code Playgroud)

但是如果我们使用通勤:

user=> (dosync (ref-set test-counter 0))
0
user=> (dosync (commute test-counter fn-with-side-effects))
Hello!
Hello! ; hello is printed twice!
1
Run Code Online (Sandbox Code Playgroud)

因此在通勤版本中,该函数显然仅修改ref一次,因为最终值为1.但是修改器函数的副作用执行两次.为什么会这样?

小智 9

我做了一些实验,以了解它是如何commute工作的.我想将我的解释分为三部分:

  • 比较和设置语义
  • alter
  • commute

比较和设置语义

我认为Clojure for Brave and True已经很好地解释了它:

swap! 实现"比较和设置"语义,这意味着它在内部执行以下操作:

  1. 它读取原子的当前状态
  2. 然后,它将更新功能应用于该状态
  3. 接下来,它检查它在步骤1中读取的值是否与原子的当前值相同
  4. 如果是,那就交换!更新atom以引用步骤2的结果
  5. 如果不是,那就换掉吧!重试,再次通过步骤1完成该过程.

swap!atom,但我们知道这将帮助我们了解altercommute,因为他们得到了使用类似的方法来更新ref.

不像atom,ref修改(通过alter,commute,ref-set)具有在事务中被包装.当事务开始(或重试)时,它将捕获所有包含的快照ref(因为alter需要它).ref只有在提交事务时才会修改.

alter

在一个事务中,所有ref将通过alter组形式进行修改.如果组中的任何一个ref未通知其更改,则将重试事务.基本上alter做到以下几点:

  1. Compare its altering ref to the snapshot captured by transaction. If they look different, retry the transaction; else
  2. Create a new state from the snapshot using function provided.
  3. Compare ref to the snapshot again. If they look different, retry the transaction; else
  4. Try to write-lock the ref, don't let anybody modify it until end of this transaction trial. If failed (ref has already been locked), wait for a while (e.g. 100 ms), then retry the transaction.
  5. Tell transaction to update this ref into new state when it perform commission.

Let's demonstrate a smooth alteration. First, we are going to create a thread t1 to alter 3 counters c1, c2 and c3 with slow-inc.

(ns testing.core)

(def start (atom 0)) ; Record start time.

(def c1 (ref 0)) ; Counter 1
(def c2 (ref 0)) ; Counter 2
(def c3 (ref 0)) ; Counter 3

(defn milliTime 
  "Get current time in millisecond."
  []
  (int (/ (System/nanoTime) 1000000)))

(defn lap 
  "Get elapse time since 'start' in millisecond."
  []
  (- (milliTime) @start))

(defn slow-inc
  "Slow increment, takes 1 second."
  [x x-name]
  (println "slow-inc beg" x-name ":" x "|" (lap) "ms")
  (Thread/sleep 1000)
  (println "slow-inc end" x-name ":" (inc x) "|" (lap) "ms")
  (inc x))

(defn fast-inc
  "Fast increment. The value it prints is incremented."
  [x x-name]
  (println "fast-inc    " x-name ":" (inc x) "|" (lap) "ms")
  (inc x))

(defn -main
  []
  ;; Initialize c1, c2, c3 and start.
  (dosync (ref-set c1 0) 
          (ref-set c2 0)
          (ref-set c3 0))
  (reset! start (milliTime))

  ;; Start two new threads simultaneously.
  (let [t1 (future
             (dosync
               (println "transaction start   |" (lap) "ms")
               (alter c1 slow-inc "c1")
               (alter c2 slow-inc "c2")
               (alter c3 slow-inc "c3")
               (println "transaction end     |" (lap) "ms")))
        t2 (future)]

    ;; Dereference all of them (wait until all 2 threads finish).
    @t1 @t2 

    ;; Print final counters' values.
    (println "c1 :" @c1)
    (println "c2 :" @c2)
    (println "c3 :" @c3)))
Run Code Online (Sandbox Code Playgroud)

And we got this:

transaction start   | 3 ms    ; 1st try
slow-inc beg c1 : 0 | 8 ms
slow-inc end c1 : 1 | 1008 ms
slow-inc beg c2 : 0 | 1009 ms
slow-inc end c2 : 1 | 2010 ms
slow-inc beg c3 : 0 | 2010 ms
slow-inc end c3 : 1 | 3011 ms
transaction end     | 3012 ms
c1 : 1
c2 : 1
c3 : 1
Run Code Online (Sandbox Code Playgroud)

Smooth procedure. No surprise.

Let's see what will happen if a ref (let's say, c3) is modified before its alteration ((alter c3 ...)). We'll modify it during alteration of c1. Edit the let's binding of t2 into:

t2 (future
     (Thread/sleep 900) ; Increment at 900 ms
     (dosync (alter c3 fast-inc "c3")))
Run Code Online (Sandbox Code Playgroud)

Result:

transaction start   | 2 ms    ; 1st try
slow-inc beg c1 : 0 | 7 ms
fast-inc     c3 : 1 | 904 ms  ; c3 being modified in thread t2
slow-inc end c1 : 1 | 1008 ms
slow-inc beg c2 : 0 | 1009 ms
slow-inc end c2 : 1 | 2010 ms
transaction start   | 2011 ms ; 2nd try
slow-inc beg c1 : 0 | 2011 ms
slow-inc end c1 : 1 | 3012 ms
slow-inc beg c2 : 0 | 3013 ms
slow-inc end c2 : 1 | 4014 ms
slow-inc beg c3 : 1 | 4015 ms
slow-inc end c3 : 2 | 5016 ms
transaction end     | 5016 ms
c1 : 1
c2 : 1
c3 : 2
Run Code Online (Sandbox Code Playgroud)

As you can see, in step 1 of 1st-try-(alter c3 ...), it realize c3 (val = 1) looks different to the snapshot (val = 0) captured by transaction, so it retry the transaction.

Now, what if a ref (let's say, c1) was modified during its alteration ((alter c1 ...))? We will modify c1 on thread t2. Edit the let's binding of t2 into:

t2 (future
     (Thread/sleep 900) ; Increment at 900 ms
     (dosync (alter c1 fast-inc "c1")))
Run Code Online (Sandbox Code Playgroud)

Result:

transaction start   | 3 ms    ; 1st try
slow-inc beg c1 : 0 | 8 ms
fast-inc     c1 : 1 | 904 ms  ; c1 being modified in thread t2
slow-inc end c1 : 1 | 1008 ms
transaction start   | 1009 ms ; 2nd try
slow-inc beg c1 : 1 | 1009 ms
slow-inc end c1 : 2 | 2010 ms
slow-inc beg c2 : 0 | 2011 ms
slow-inc end c2 : 1 | 3011 ms
slow-inc beg c3 : 0 | 3012 ms
slow-inc end c3 : 1 | 4013 ms
transaction end     | 4014 ms
c1 : 2
c2 : 1
c3 : 1
Run Code Online (Sandbox Code Playgroud)

This time, in step 3 of 1st-try-(alter c1 ...), it find out that ref has been modified, so it call for a transaction retry.

Now, let's try to modify a ref (let's say, c1) after its alteration ((alter c1 ...)). We'll modify it during alteration of c2.

t2 (future
     (Thread/sleep 1600) ; Increment at 1600 ms
     (dosync (alter c1 fast-inc "c1")))
Run Code Online (Sandbox Code Playgroud)

Result:

transaction start   | 3 ms    ; 1st try
slow-inc beg c1 : 0 | 8 ms
slow-inc end c1 : 1 | 1009 ms
slow-inc beg c2 : 0 | 1010 ms
fast-inc     c1 : 1 | 1604 ms ; try to modify c1 in thread t2, but failed
fast-inc     c1 : 1 | 1705 ms ; keep trying...
fast-inc     c1 : 1 | 1806 ms
fast-inc     c1 : 1 | 1908 ms
fast-inc     c1 : 1 | 2009 ms
slow-inc end c2 : 1 | 2011 ms
slow-inc beg c3 : 0 | 2012 ms
fast-inc     c1 : 1 | 2110 ms ; still trying...
fast-inc     c1 : 1 | 2211 ms
fast-inc     c1 : 1 | 2312 ms
fast-inc     c1 : 1 | 2413 ms
fast-inc     c1 : 1 | 2514 ms
fast-inc     c1 : 1 | 2615 ms
fast-inc     c1 : 1 | 2716 ms
fast-inc     c1 : 1 | 2817 ms
fast-inc     c1 : 1 | 2918 ms ; and trying....
slow-inc end c3 : 1 | 3012 ms
transaction end     | 3013 ms ; 1st try ended, transaction committed.
fast-inc     c1 : 2 | 3014 ms ; finally c1 modified successfully
c1 : 2
c2 : 1
c3 : 1
Run Code Online (Sandbox Code Playgroud)

Since 1st-try-(alter c1 ...) has locked c1 (step 4), no one can modify c1, not until this round of transaction trial ended.

That's it for alter.

So, what if we don't want c1, c2, c3 all group together? Let's say I want to retry the transaction only when c1 or c3 fail their alteration (being modified by other thread during the transaction). I don't care the state of c2. There's no need to retry the transaction if c2 is modified during the transaction, so that I can save some time. How do we achieve that? Yes, through commute.

commute

Basically, commute does the following:

  1. Run the function provided using ref directly (not from snapshot) but don't do anything with the result.
  2. Ask the transaction to call real-commute with the same arguments before transaction commits. (real-commute is just a made up name by me.)

I'm don't actually know why commute has to run step 1. It seems to me step 2 alone is enough. real-commute does the following:

  1. Read-and-write-lock the ref until end of this transaction trial if it hasn't been locked, else retry the transaction.
  2. Create a new state from ref using given function.
  3. Tell transaction to update this ref into new state when it perform commission.

Let's examine it. Edit the let's binding into:

t1 (future
     (dosync
       (println "transaction start   |" (lap) "ms")
       (alter c1 slow-inc "c1")
       (commute c2 slow-inc "c2") ; changed to commute
       (alter c3 slow-inc "c3")
       (println "transaction end     |" (lap) "ms")))
t2 (future)
Run Code Online (Sandbox Code Playgroud)

Result:

transaction start   | 3 ms
slow-inc beg c1 : 0 | 7 ms    ; called by alter
slow-inc end c1 : 1 | 1008 ms
slow-inc beg c2 : 0 | 1009 ms ; called by commute
slow-inc end c2 : 1 | 2009 ms
slow-inc beg c3 : 0 | 2010 ms ; called by alter
slow-inc end c3 : 1 | 3011 ms
transaction end     | 3012 ms
slow-inc beg c2 : 0 | 3012 ms ; called by real-commute
slow-inc end c2 : 1 | 4012 ms
c1 : 1
c2 : 1
c3 : 1
Run Code Online (Sandbox Code Playgroud)

So slow-inc get called twice if you use commute, once by commute and once by real-commute before transaction commit. The first commute didn't do anything with slow-inc's result.

slow-inc can get called more than twice. For example, let's try to modify c3 on thread t2:

t2 (future
     (Thread/sleep 500) ; modify c3 at 500 ms
     (dosync (alter c3 fast-inc "c3")))
Run Code Online (Sandbox Code Playgroud)

Result:

transaction start   | 2 ms
slow-inc beg c1 : 0 | 8 ms
fast-inc     c3 : 1 | 504 ms  ; c3 modified at thread t2
slow-inc end c1 : 1 | 1008 ms
slow-inc beg c2 : 0 | 1009 ms ; 1st time
slow-inc end c2 : 1 | 2010 ms
transaction start   | 2012 ms
slow-inc beg c1 : 0 | 2012 ms
slow-inc end c1 : 1 | 3013 ms
slow-inc beg c2 : 0 | 3014 ms ; 2nd time
slow-inc end c2 : 1 | 4015 ms
slow-inc beg c3 : 1 | 4016 ms
slow-inc end c3 : 2 | 5016 ms
transaction end     | 5017 ms
slow-inc beg c2 : 0 | 5017 ms ; 3rd time
slow-inc end c2 : 1 | 6018 ms
c1 : 1
c2 : 1
c3 : 2
Run Code Online (Sandbox Code Playgroud)

In first trial of transaction, after (commute c2 ...) has been evaluated, (alter c3 ...) found out that c3 was different from the snapshot, thus triggered transaction to retry. If (alter c3 ...) is before (commute c2 ...), then retrial will be triggered before evaluation or (commute c2 ..). So, placing all commutes after all alters may save you some time.

Let's see what will happen if we modify c2 in thread t2 while transaction in t1 is being evaluated.

t2 (future
     (Thread/sleep 500) ; before evaluation of (commute c2 ...)
     (dosync (alter c2 fast-inc "c2"))
     (Thread/sleep 1000) ; during evaluation of (commute c2 ...)
     (dosync (alter c2 fast-inc "c2"))
     (Thread/sleep 1000) ; after evaluation of (commute c2 ...)
     (dosync (alter c2 fast-inc "c2")))
Run Code Online (Sandbox Code Playgroud)

Result:

transaction start   | 3 ms
slow-inc beg c1 : 0 | 9 ms
fast-inc     c2 : 1 | 504 ms  ; before
slow-inc end c1 : 1 | 1009 ms
slow-inc beg c2 : 1 | 1010 ms
fast-inc     c2 : 2 | 1506 ms ; during
slow-inc end c2 : 2 | 2011 ms
slow-inc beg c3 : 0 | 2012 ms
fast-inc     c2 : 3 | 2508 ms ; after
slow-inc end c3 : 1 | 3013 ms
transaction end     | 3013 ms
slow-inc beg c2 : 3 | 3014 ms
slow-inc end c2 : 4 | 4014 ms
c1 : 1
c2 : 4
c3 : 1
Run Code Online (Sandbox Code Playgroud)

As you can see, no transaction retrial, and c2 still updated to our expected value (4), thanks to real-commute.

Now I want to demonstrate the effect of step 1 in real-commute: its ref is read-and-write-locked. First, to confirm it is read-locked:

t2 (future
     (Thread/sleep 3500) ; during real-commute
     (println "try to read c2:" @c2 "  |" (lap) "ms"))
Run Code Online (Sandbox Code Playgroud)

Result:

transaction start   | 3 ms
slow-inc beg c1 : 0 | 9 ms
slow-inc end c1 : 1 | 1010 ms
slow-inc beg c2 : 0 | 1010 ms
slow-inc end c2 : 1 | 2011 ms
slow-inc beg c3 : 0 | 2012 ms
slow-inc end c3 : 1 | 3012 ms
transaction end     | 3013 ms
slow-inc beg c2 : 0 | 3013 ms
slow-inc end c2 : 1 | 4014 ms
try to read c2: 1   | 4015 ms ; got printed after transaction trial ended
c1 : 1
c2 : 1
c3 : 1
Run Code Online (Sandbox Code Playgroud)

The @c2 is blocked until c2 got unlocked. That's why println got evaluated after 4000 ms even though our order is sleep for 3500 ms.

Since commute and alter need to read their ref to perform given function, they will be blocked until their ref is unlocked too. You can try to replace (println ...) with (alter c2 fast-inc "c2"). The effect should be same as this example.

So, in order to confirm it is write-locked, we can use ref-set:

t2 (future
     (Thread/sleep 3500) ; during real-commute
     (dosync (ref-set c2 (fast-inc 9 " 8"))))
Run Code Online (Sandbox Code Playgroud)

Result:

transaction start   | 3 ms
slow-inc beg c1 : 0 | 8 ms
slow-inc end c1 : 1 | 1008 ms
slow-inc beg c2 : 0 | 1010 ms
slow-inc end c2 : 1 | 2011 ms
slow-inc beg c3 : 0 | 2012 ms
slow-inc end c3 : 1 | 3013 ms
transaction end     | 3014 ms
slow-inc beg c2 : 0 | 3014 ms
fast-inc      8 : 9 | 3504 ms ; try to ref-set but failed
fast-inc      8 : 9 | 3605 ms ; try again...
fast-inc      8 : 9 | 3706 ms
fast-inc      8 : 9 | 3807 ms
fast-inc      8 : 9 | 3908 ms
fast-inc      8 : 9 | 4009 ms
slow-inc end c2 : 1 | 4015 ms
fast-inc      8 : 9 | 4016 ms ; finally success, c2 ref-set to 9
c1 : 1
c2 : 9
c3 : 1
Run Code Online (Sandbox Code Playgroud)

From here you can also guess what ref-set does:

  • If its ref has been write-locked, retry the transaction after some time (e.g. 100 ms); else tell the transaction to update this ref into given value when it perform commission.

real-commute can fail too, when its ref has been locked on step 1. Unlike alter or ref-set, it doesn't wait for some time before retrying the transaction. This may cause problem if the ref is locked for too long. For example, we'll try to modify c1 after its alteration, using commute:

t2 (future
     (Thread/sleep 2500) ; during alteration of c3
     (dosync (commute c1 fast-inc "c1")))
Run Code Online (Sandbox Code Playgroud)

Result:

transaction start   | 3 ms
slow-inc beg c1 : 0 | 8 ms
slow-inc end c1 : 1 | 1008 ms
slow-inc beg c2 : 0 | 1010 ms
slow-inc end c2 : 1 | 2011 ms
slow-inc beg c3 : 0 | 2012 ms
fast-inc     c1 : 1 | 2506 ms
fast-inc     c1 : 1 | 2506 ms
fast-inc     c1 : 1 | 2506 ms
...

Exception in thread "main" java.util.concurrent.ExecutionException:
  java.lang.RuntimeException: Transaction failed after reaching retry
  limit, compiling: ...
Run Code Online (Sandbox Code Playgroud)

Recall that c1 is write-locked by alter after its alteration, therefore real-commute keep failing and keep retrying the transaction. Without buffer time, it reached transaction retrial limit and boomed.

Note

commute helps to improve concurrency by letting user reduce ref that will cause transaction retrial, with the cost of calling given function at least twice to update its ref. There are some scenarios which commute can be slower than alter. For example, when the only thing to do in a transaction is update a ref, commute cost more than alter:

(def c (ref 0)) ; counter

(defn slow-inc
  [x]
  (Thread/sleep 1000)
  (inc x))

(defn add-2
  "Create two threads to slow-inc c simultaneously with func.
  func can be alter or commute."
  [func]
  (let [t1 (future (dosync (func c slow-inc)))
        t2 (future (dosync (func c slow-inc)))]
    @t1 @t2))

(defn -main
  [& args]
  (dosync (ref-set c 0))
  (time (add-2 alter))
  (dosync (ref-set c 0))
  (time (add-2 commute)))
Run Code Online (Sandbox Code Playgroud)

Result:

"Elapsed time: 2003.239891 msecs" ; alter
"Elapsed time: 4001.073448 msecs" ; commute
Run Code Online (Sandbox Code Playgroud)

Here are the procedures of alter:

  • 0 ms: t1's alter started.
  • 1 ms: t2's alter started.
  • 1000 ms: t1's alter successfully, t1 committed, c became 1.
  • 1001 ms: t2's alter found out that c is different from its snapshot (step 2), retry transaction.
  • 2001 ms: t2's alter successfully, t2 committed, c became 2.

And procedures of commute:

  • 0 ms: t1's commute started.
  • 1 ms: t2's commute started.
  • 1000 ms: t1's real-commute started. c is locked.
  • 1001 ms: t2's real-commute started. It found out that c has been locked, so it retry the transaction (step 1).
  • 1002 ms: t2's commute started, but c is locked, so it is blocked.
  • 2000 ms: t1's real-commute ended, transaction committed. c became 1. t2 is unblocked.
  • 3002 ms: t2's real-commute started.
  • 4002 ms: t2's real-commute ended, transaction committed. c became 2.

That's why commute is slower than alter in this example.

This may look contradict to example of commute from clojuredocs.org. The key difference is, in his example, the delay (100 ms) happened in body of transaction, but the delay happened in slow-inc in my example. This difference cause his real-commute phase run very fast, thus reducing the locking time and blocking time. Less locking time means less probability of retrial. That's why in his example, commute is faster than alter. Change his inc into slow-inc and you will get same observation as mine.

That's all.


Sam*_*ies 6

我想到了.

发生这种情况是因为通勤功能总是执行两次.

Commute允许比alter更多的潜在并发性,因为它不会在事务的整个持续时间内锁定身份.

相反,它在事务开始时读取标识的值一次,并且当调用通勤操作时,它返回应用于此值的通勤函数.

完全有可能这个值现在已经过时了,因为某些其他线程可能在事务开始和通勤函数执行之间的某个时间发生了变化.

但是,保持完整性是因为在实际修改ref时,在提交时执行了通勤功能.

该网站对这些差异有一个非常明确的解释:http://squirrel.pl/blog/2010/07/13/clojure-alter-vs-commute/

实际上,当调用commute时,它会立即返回在ref上运行函数的结果.在事务的最后,它再次执行计算,这次同步(如更改)更新ref.这就是为什么最终计数器值为51,即使最后一个线程打印45.

如果您的通勤功能有副作用,请小心,因为它们将被执行两次!