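Suppose I have two pipe-delimited files containing bookmark data. How can I read in the data and then determine the differences between the two sets in Clojure?

Set #1:
2|www.cnn.com|News|This is CNN
3|www.msnbc.com|Search|
4|news.ycombinator.com|News|Tech News
5|bing.com|Search|contender

Set #2:
1|www.google.com|Search|The King of Search
2|www.cnn.com|News|This is CNN
3|www.msnbc.com|Search|New comment
4|news.ycombinator.com|News|Tech News

Desired output:
Id #1 is missing in set #1
Id #5 is missing in set #2
Id #3 is different:
  -> www.msnbc.com|Search|
  -> www.msnbc.com|Search|New comment
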
;; Uses the legacy clojure.contrib namespaces (str-utils, duck-streams, pprint),
;; which predate Clojure 1.3.
(use '[clojure.contrib str-utils duck-streams pprint]
     '[clojure set])

;; Read a file into a map of id -> rest of the line,
;; splitting each line at the first pipe only.
(defn read-bookmarks [filename]
  (apply hash-map
         (mapcat #(re-split #"\|" % 2)
                 (read-lines filename))))

;; Compare the two files by id and return the report as a string.
(defn diff-bookmarks [filename1 filename2]
  (let [f1 (read-bookmarks filename1)
        f2 (read-bookmarks filename2)
        k1 (set (keys f1))
        k2 (set (keys f2))
        missing-in-1 (difference k2 k1)
        missing-in-2 (difference k1 k2)
        ;; ids present in both files whose remaining fields differ
        present-but-different (filter #(not= (f1 %) (f2 %))
                                      (intersection k1 k2))]
    (cl-format nil "~{Id #~a is missing in set #1~%~}~{Id #~a is missing in set #2~%~}~{~{Id #~a is different~% -> ~a~% -> ~a~%~}~}"
               missing-in-1
               missing-in-2
               (map #(list % (f1 %) (f2 %))
                    present-but-different))))

(print (diff-bookmarks "bookmarks.csv" "bookmarks2.csv"))
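
The monolithic clojure.contrib library is not available on current Clojure versions, so the code above will not load there. Below is a rough sketch of the same approach using only namespaces that ship with Clojure (clojure.string, clojure.java.io, clojure.set, clojure.pprint). It is not the original answer's code: the namespace name and the helpers parse-bookmarks, diff-bookmark-maps and format-report are hypothetical names chosen for illustration. It also assumes that ids should be trimmed, that blank lines can be skipped, and that sorted output is wanted.

```clojure
(ns bookmarks.diff
  (:require [clojure.java.io :as io]
            [clojure.pprint :refer [cl-format]]
            [clojure.set :as set]
            [clojure.string :as str]))

(defn parse-bookmarks
  "Turns a seq of \"id|details\" lines into a map of id -> details.
  Each line is split at the first pipe only; blank lines are skipped
  (an assumption, the original code does not handle them)."
  [lines]
  (into {}
        (comp (remove str/blank?)
              (map #(str/split % #"\|" 2))
              (map (fn [[id details]] [(str/trim id) (or details "")])))
        lines))

(defn read-bookmarks
  "Reads a pipe-delimited file into a map. `into` is eager, so the
  lines are fully consumed before the reader is closed."
  [filename]
  (with-open [rdr (io/reader filename)]
    (parse-bookmarks (line-seq rdr))))

(defn diff-bookmark-maps
  "Compares two id -> details maps and returns the differences as data."
  [m1 m2]
  (let [k1 (set (keys m1))
        k2 (set (keys m2))]
    {:missing-in-1 (sort (set/difference k2 k1))
     :missing-in-2 (sort (set/difference k1 k2))
     :different    (for [k (sort (set/intersection k1 k2))
                         :when (not= (m1 k) (m2 k))]
                     (list k (m1 k) (m2 k)))}))

(defn format-report
  "Renders the diff map as text, reusing the format string from the answer."
  [{:keys [missing-in-1 missing-in-2 different]}]
  (cl-format nil
             "~{Id #~a is missing in set #1~%~}~{Id #~a is missing in set #2~%~}~{~{Id #~a is different~%  -> ~a~%  -> ~a~%~}~}"
             missing-in-1 missing-in-2 different))

;; Usage, assuming both files exist in the working directory:
;; (print (format-report (diff-bookmark-maps (read-bookmarks "bookmarks.csv")
;;                                           (read-bookmarks "bookmarks2.csv"))))
```

Returning the differences as a map and formatting them in a separate step keeps the comparison testable without touching the file system, since parse-bookmarks accepts any sequence of lines.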