gam*_*zzi 1 clojure hashmap counting
Clojure: Is there a better (or more idiomatic) way to count the letters of a string into a hash map than the following?
; Clojure 1.10.1
user=> (def s1 "A string with some letters")
user=> (def d1 (apply merge-with + (map #(hash-map % 1) (seq s1))))
user=> d1
{\space 4, \A 1, \e 3, \g 1, \h 1, \i 2, \l 1, \m 1, \n 1, \o 1, \r 2, \s 3, \t 4, \w 1}
Use frequencies (part of clojure.core).
Documentation: https://clojuredocs.org/clojure.core/frequencies
Implementation: https://github.com/clojure/clojure/blob/clojure-1.9.0/src/clj/clojure/core.clj#L7123
(def s1 "A string with some letters")
(def d1 (apply merge-with + (map #(hash-map % 1) (seq s1))))
(def d2 (frequencies s1))
(println d1) ; { 4, A 1, e 3, g 1, h 1, i 2, l 1, m 1, n 1, o 1, r 2, s 3, t 4, w 1}
(println d2) ; { 4, A 1, e 3, g 1, h 1, i 2, l 1, m 1, n 1, o 1, r 2, s 3, t 4, w 1}
(println (= d1 d2)) ; true
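For intuition about what is being counted, the same result can also be written as a plain reduce over the characters. This is only a sketch: count-items is a hypothetical helper name, not something from clojure.core, and it assumes Clojure 1.7 or later for update.

```clojure
;; Hypothetical helper (not part of clojure.core): walks the collection once
;; and increments a counter per item in an ordinary persistent map.
(defn count-items [coll]
  (reduce (fn [counts x]
            ;; (fnil inc 0) treats a missing key as 0 before incrementing
            (update counts x (fnil inc 0)))
          {}
          coll))

(println (= (count-items "A string with some letters")
            (frequencies "A string with some letters")))
```

In practice frequencies is still the shorter and more recognizable choice; the reduce form is mainly useful when the counting step needs to be customized.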
Performance
(defn rand-str
  "Generates a random string of len uppercase letters."
  [len]
  (apply str (repeatedly len #(char (+ (rand-int 26) 65)))))
(def s (rand-str 100000))
(time (frequencies s)) ; ~100 ms
(time (apply merge-with + (map #(hash-map % 1) (seq s)))) ; ~600 ms
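So with 100,000 elements, frequencies is roughly 6 times faster in this measurement.
The better performance of frequencies is probably due to its use of transient data structures.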
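To illustrate the transient idea, here is a minimal sketch of counting with a transient map. The name count-with-transient is hypothetical; the linked core.clj source is the authoritative reference for how frequencies itself is written.

```clojure
;; Hypothetical name; a sketch of counting with a transient map.
;; The map is mutated in place during the reduce and converted back
;; to a persistent map exactly once at the end.
(defn count-with-transient [coll]
  (persistent!
    (reduce (fn [counts x]
              ;; assoc! returns the transient to use for the next step
              (assoc! counts x (inc (get counts x 0))))
            (transient {})
            coll)))

(println (= (count-with-transient "A string with some letters")
            (frequencies "A string with some letters")))
```

By contrast, the merge-with version allocates a one-entry hash map per character and merges each of them into the accumulated result, which plausibly accounts for much of the gap.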