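For example, suppose I have two pipe-delimited files containing bookmark data. How would I read the data in with Clojure and then determine the differences between the two sets?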
File 1:
2|www.cnn.com|News|This is CNN
3|www.msnbc.com|Search|
4|news.ycombinator.com|News|Tech News
5|bing.com|Search|The contender

File 2:
1|www.google.com|Search|The King of Search
2|www.cnn.com|News|This is CNN
3|www.msnbc.com|Search|New Comment
4|news.ycombinator.com|News|Tech News

Expected output:
Id #1 is missing in set #1
Id #5 is missing in set #2
Id #3 is different:
 -> www.msnbc.com|Search|
 -> www.msnbc.com|Search|New Comment

(use '[clojure.contrib str-utils duck-streams pprint]
     '[clojure set])

(defn read-bookmarks [filename]
  (apply hash-map
         (mapcat #(re-split #"\|" % 2)
                 (read-lines filename))))

(defn diff-bookmarks [filename1 filename2]
  (let [f1 (read-bookmarks filename1)
        f2 (read-bookmarks filename2)
        k1 (set (keys f1))
        k2 (set (keys f2))
        missing-in-1 (difference k2 k1)
        missing-in-2 (difference k1 k2)
        present-but-different (filter #(not= (f1 %) (f2 %))
                                      (intersection k1 k2))]
    (cl-format nil
               "~{Id #~a is missing in set #1~%~}~{Id #~a is missing in set #2~%~}~{~{Id #~a is different~% -> ~a~% -> ~a~%~}~}"
               missing-in-1
               missing-in-2
               (map #(list % (f1 %) (f2 %)) present-but-different))))

(print (diff-bookmarks "bookmarks.csv" "bookmarks2.csv"))
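The answer above depends on the monolithic clojure.contrib library, which is not available for Clojure releases after 1.2. Here is a rough sketch of the same approach using only namespaces that ship with Clojure itself (clojure.string, clojure.set, clojure.java.io). The namespace name `bookmarks.diff` and the helper `parse-bookmarks` are made up for illustration. Unlike the original, parsing is separated from file reading, blank lines are skipped, and the ids are sorted so that the report order is stable.

```clojure
(ns bookmarks.diff ; hypothetical namespace name
  (:require [clojure.java.io :as io]
            [clojure.set :as set]
            [clojure.string :as str]))

(defn parse-bookmarks
  "Turns lines such as \"2|www.cnn.com|News|This is CNN\" into a map
  from id to the rest of the line. Blank lines are skipped (an assumption;
  the original answer does not handle them)."
  [lines]
  (into {}
        (map #(str/split % #"\|" 2)
             (remove str/blank? lines))))

(defn read-bookmarks
  "Reads a pipe-delimited bookmark file into an id -> entry map."
  [filename]
  (with-open [rdr (io/reader filename)]
    ;; `into` is eager, so the file is fully read before the reader closes.
    (parse-bookmarks (line-seq rdr))))

(defn diff-bookmarks
  "Compares two id -> entry maps and returns the report as a string."
  [b1 b2]
  (let [k1 (set (keys b1))
        k2 (set (keys b2))
        ;; Ids are strings, so `sort` orders them lexicographically.
        missing-in-1 (sort (set/difference k2 k1))
        missing-in-2 (sort (set/difference k1 k2))
        changed      (sort (filter #(not= (b1 %) (b2 %))
                                   (set/intersection k1 k2)))]
    (str/join
     (concat
      (map #(str "Id #" % " is missing in set #1\n") missing-in-1)
      (map #(str "Id #" % " is missing in set #2\n") missing-in-2)
      (map #(str "Id #" % " is different:\n -> " (b1 %) "\n -> " (b2 %) "\n")
           changed)))))
```

With the two files from the question saved under the names used in the answer, it could be called as (print (diff-bookmarks (read-bookmarks "bookmarks.csv") (read-bookmarks "bookmarks2.csv"))).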