Why does the time macro report a very short elapsed time for a slow function call?

Question

Pal*_*han 0 future clojure lazy-evaluation lazy-sequences

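I was looking at the exercises at the end of chapter 9 of Clojure for the Brave and True (in particular the last one, which searches multiple engines and returns the first hit from each).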

I mocked out the actual search (the slurp part) like this:

(defn search-for
  [query engine]
  (Thread/sleep 2000)
  (format "https://www.%s.com/search?q%%3D%s", engine query))

and implemented the behavior like this:

(defn get-first-hit-from-each
  [query engines]
  (let [futs (map (fn [engine]
                    (future (search-for query engine))) engines)]
    (doall futs)
    (map deref futs)))

(I know this returns a list and the exercise asks for a vector, but a call to into would take care of that...)

But when I run it in the REPL

(time (get-first-hit-from-each "gray+cat" '("google" "bing")))

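it seems to take 2 seconds now that I have added the doall (since map returns a lazy seq, I don't think any of the futures are even started unless I consume the seq; (last futs) also seemed to work). But when I use the time macro in the REPL, it reports almost no time consumed, even though the call takes 2 seconds: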

(time (get-first-hit-from-each "gray+cat" '("google" "bing")))
"Elapsed time: 0.189609 msecs"
("https://www.google.com/search?q%3Dgray+cat" "https://www.bing.com/search?q%3Dgray+cat")

What is going on with the time macro here?

Answer 1

Cou*_*lok 5

TL;DR: Lazy seqs don't play well with the time macro, and your function get-first-hit-from-each returns a lazy seq. To time a lazy seq, wrap it in a doall, as the documentation suggests. See below for a fuller thought process:

Here is the definition of the time macro in clojure.core (source):

(defmacro time
  "Evaluates expr and prints the time it took.  Returns the value of
 expr."
  {:added "1.0"}
  [expr]
  `(let [start# (. System (nanoTime))
         ret# ~expr]
     (prn (str "Elapsed time: " (/ (double (- (. System (nanoTime)) start#)) 1000000.0) " msecs"))
     ret#))

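Notice how the macro saves the return value of expr in ret# and then prints the elapsed time? Only after that is ret# returned. The key point is that your function get-first-hit-from-each returns a lazy sequence (because map returns a lazy sequence):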

(type (get-first-hit-from-each "gray+cat" '("google" "bing")))
;; => clojure.lang.LazySeq

So when you call (time (get-first-hit-from-each "gray+cat" '("google" "bing"))), what gets saved in ret# is a lazy sequence, which isn't actually evaluated until we try to use its values...
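To make the laziness concrete, here is a minimal sketch that is independent of the search example. The names calls and slow-inc are made up for illustration; the counter only records whether the mapped function has actually run.

```clojure
;; Illustrative sketch; `calls` and `slow-inc` are hypothetical names.
(def calls (atom 0))

(defn slow-inc
  [x]
  (swap! calls inc)   ; record that the function actually ran
  (Thread/sleep 100)  ; stand-in for slow work
  (inc x))

;; Building the lazy seq does not call slow-inc yet.
(def result (map slow-inc [1 2 3]))

(println (realized? result) @calls) ; prints: false 0
(println (doall result))            ; forces evaluation, prints: (2 3 4)
(println (realized? result) @calls) ; prints: true 3
```

The slow work only happens at the doall line, which is the same thing that happens to the deref calls in get-first-hit-from-each.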

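We can check whether a lazy sequence has been evaluated with the realized? function. So let's tweak the time macro by adding a line that checks whether ret# has been evaluated, right after the elapsed time is printed: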

(defmacro my-time
  [expr]
  `(let [start# (. System (nanoTime))
         ret# ~expr]
     (prn (str "Elapsed time: " (/ (double (- (. System (nanoTime)) start#)) 1000000.0) " msecs"))
     (prn (realized? ret#)) ;; has the lazy sequence been evaluated?
     ret#))

Now test it:

(my-time (get-first-hit-from-each "gray+cat" '("google" "bing")))
"Elapsed time: 0.223054 msecs"
false
;; => ("https://www.google.com/search?q%3Dgray+cat" "https://www.bing.com/search?q%3Dgray+cat")

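Nope... so that's why time prints a misleading number. The futures have been started, but the lazy (map deref futs) has not run yet when the elapsed time is printed; the blocking deref calls only happen afterwards, when the REPL prints the returned sequence.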
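The same effect can be reproduced in isolation by measuring how long it takes to build a lazy seq versus how long it takes to consume it. This is only a sketch: elapsed-ms and slow-seq are hypothetical helpers, not part of clojure.core, and the 200 ms sleep is an arbitrary stand-in for slow work.

```clojure
;; Sketch only; `elapsed-ms` and `slow-seq` are hypothetical helpers.
(defn elapsed-ms
  "Calls the zero-argument function f and returns the elapsed
  wall-clock time in milliseconds."
  [f]
  (let [start (System/nanoTime)]
    (f)
    (/ (double (- (System/nanoTime) start)) 1000000.0)))

(defn slow-seq
  "Returns a lazy seq whose elements each take about 200 ms to produce."
  []
  (map (fn [x] (Thread/sleep 200) x) [1 2 3]))

(def lazy-ms  (elapsed-ms #(slow-seq)))         ; only builds the lazy seq
(def eager-ms (elapsed-ms #(doall (slow-seq)))) ; forces all three sleeps

(println "lazy: " lazy-ms "msecs")
(println "eager:" eager-ms "msecs")
```

The first figure should be tiny, while the second should come out at roughly three times the per-element sleep, since only the doall version actually does the work inside the timed region.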

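To fix this and get an accurate time, we need to force evaluation of the lazy seq. That can be done by placing a doall in any of several places, either inside your function, wrapping the map: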

(defn get-first-hit-from-each
  [query engines]
  (let [futs (map (fn [engine]
                    (future (search-for query engine))) engines)]
    (doall futs)
    (doall (map deref futs))))
;; => #'propeller.core/get-first-hit-from-each

(time (get-first-hit-from-each "gray+cat" '("google" "bing")))
"Elapsed time: 2005.478689 msecs"
;; => ("https://www.google.com/search?q%3Dgray+cat" "https://www.bing.com/search?q%3Dgray+cat")

or inside the time call, wrapping the function call:

(time (doall (get-first-hit-from-each "gray+cat" '("google" "bing"))))
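As a possible alternative to doall (not what the code above uses), mapv is eager and returns a vector, which also happens to match what the exercise asks for. The sketch below assumes the mocked search-for from the question, with the sleep shortened to 500 ms; the name get-first-hit-from-each-v is made up for illustration.

```clojure
;; Sketch under assumptions: `search-for` is the mock from the question,
;; and `get-first-hit-from-each-v` is a hypothetical name.
(defn search-for
  [query engine]
  (Thread/sleep 500) ; shortened from the question's 2000 ms for a quicker demo
  (format "https://www.%s.com/search?q%%3D%s" engine query))

(defn get-first-hit-from-each-v
  [query engines]
  ;; The inner mapv starts every future right away; the outer mapv then
  ;; blocks on each one in turn and collects the results into a vector.
  (mapv deref (mapv #(future (search-for query %)) engines)))

(time (get-first-hit-from-each-v "gray+cat" ["google" "bing"]))
;; => ["https://www.google.com/search?q%3Dgray+cat" "https://www.bing.com/search?q%3Dgray+cat"]
```

Because both mapv calls finish before the function returns, time should report roughly the duration of one sleep here, with the searches running in parallel.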
