Counting the letters of a string in Clojure

gam*_*zzi 1 clojure hashmap counting

Clojure: is there a better (or more idiomatic) way to count the letters of a string into a hash map than the following?

; Clojure 1.10.1
user=> (def s1 "A string with some letters")
user=> (def d1 (apply merge-with + (map #(hash-map % 1) (seq s1))))
user=> d1
{\space 4, \A 1, \e 3, \g 1, \h 1, \i 2, \l 1, \m 1, \n 1, \o 1, \r 2, \s 3, \t 4, \w 1}

Dar*_*ylG 6

Use frequencies (part of clojure.core).

Documentation: https://clojuredocs.org/clojure.core/frequencies

Implementation: https://github.com/clojure/clojure/blob/clojure-1.9.0/src/clj/clojure/core.clj#L7123

(def s1 "A string with some letters")
(def d1 (apply merge-with + (map #(hash-map % 1) (seq s1))))
(def d2 (frequencies s1))
(println d1) ; {  4, A 1, e 3, g 1, h 1, i 2, l 1, m 1, n 1, o 1, r 2, s 3, t 4, w 1}
(println d2) ; {  4, A 1, e 3, g 1, h 1, i 2, l 1, m 1, n 1, o 1, r 2, s 3, t 4, w 1}
(println (= d1 d2)) ; true
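Note that both versions count every character, including spaces. If only actual letters should be counted, one possible sketch is to filter the string first. The name letter-counts is just an illustrative choice, and Character/isLetter is the standard Java predicate, used here on the assumption that "letter" means a Unicode letter:

```clojure
(def s1 "A string with some letters")

;; Keep only characters that Java classifies as letters, then count them.
(def letter-counts
  (frequencies (filter #(Character/isLetter %) s1)))

(get letter-counts \t)      ; => 4
(get letter-counts \space)  ; => nil, spaces were filtered out
```

frequencies accepts any seqable collection, so the filtered lazy sequence can be passed in directly without building an intermediate string.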

Performance

(defn rand-str
  "Generates a random string of len uppercase ASCII letters."
  [len]
  (apply str (repeatedly len #(char (+ (rand-int 26) 65)))))

(def s (rand-str 100000))

(time (frequencies s)) ; ~100 ms
(time (apply merge-with + (map #(hash-map % 1) (seq s)))) ; ~600 ms

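So with 100,000 elements, frequencies is roughly 6x faster in this single-run measurement.

The better performance of frequencies is probably due to its use of transient data structures: it builds up a single mutable map in one pass, instead of creating and merging a one-entry map per character.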
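To illustrate the transient-based approach, here is a minimal sketch of a counting function written in that style. The name count-items is hypothetical, and the code is modeled on the general technique rather than being a copy of the clojure.core source; see the implementation link above for the real thing.

```clojure
(defn count-items
  "Hypothetical helper: counts occurrences of each item in coll
  using a transient map that is updated in place during the reduce."
  [coll]
  (persistent!                       ; convert back to an immutable map
   (reduce (fn [counts x]
             ;; assoc! mutates the transient; a missing key counts as 0
             (assoc! counts x (inc (get counts x 0))))
           (transient {})            ; start from an empty transient map
           coll)))

(get (count-items "hello") \l) ; => 2
(= (count-items "A string with some letters")
   (frequencies "A string with some letters")) ; => true
```

The transient never escapes the function, so callers still see an ordinary persistent map while the intermediate allocations are avoided.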