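I am reading lines from a very large text file. The file contains a data set from which I want to select specific line numbers. What I want to do is read a line from the file: if it is one of the lines I want, conj it onto my result; if not, move on to the next line. I don't want to keep every line I have seen in memory, so I would like to drop lines from the reader's line sequence as I read them.
I have a function like this: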
;; evaluates but doesn't modify the line sequence so continuously adds
;; the same first line to the result. I would like this exact function
;; but somehow have it drop the first line of lines at each iteration.
(defn get-training-data [batch-size batch-num]
  (let [line-numbers (fn that returns vector of random numbers)]
    (with-open [rdr (clojure.java.io/reader "resources/sample.txt")]
      (let [lines (line-seq rdr) res []]
        (for [i (range (apply max line-numbers))
              :let [res (conj res (json/read-str (first lines)))]
              :when (some #{i} line-numbers)]
          res)))))
I also have a function like this:
;; this works as I want it to, but only with a small file and produces a
;; stack overflow with a large file
(defn get-training-data1 [batch-size batch-num]
  (let [line-numbers (fn that returns a vector of random numbers)]
    (with-open [rdr (clojure.java.io/reader "resources/sample.txt")]
      (let [lines (line-seq rdr)]
        (loop [i 0 f (apply max line-numbers) res [] lines lines]
          (if (> i f)
            res
            (if (some #{i} line-numbers)
              (recur
                (inc i)
                f
                (conj res (json/read-str (first lines)))
                (drop 1 lines))
              (recur
                (inc i)
                f
                res
                (drop 1 lines)))))))))
While trying to test this, I came up with the following simpler case:
;; works
(let [res []]
  (for [i (range 10)
        :let [res (conj res i)]
        :when (odd? i)]
    res)) ;; ([1] [3] [5] [7] [9])
;; now an attempt to get the same result but have a side effect each time,
;; produces null pointer exception.
(let [res []]
  (for [i (range 10)
        :let [res (conj res i)]
        :when (odd? i)]
    (doall
      (println i)
      res)))
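I believe that if I can figure out how to produce a side effect inside `for`, the first problem will be solved, because I could have the side effect drop the first line of the reader's line sequence.
Does anyone have any ideas?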
`map` and `filter` handle this nicely and keep everything lazy, so you never need to hold the whole file in memory.
user> (->> (line-seq (clojure.java.io/reader "project.clj")) ;; lazy sequence of lines
           (map vector (range))                              ;; add an index
           (filter #(#{1 3 7 9} (first %)))                  ;; filter by index
           (map second))                                     ;; drop the index
(" :description \"API server for Yummly mobile app(s)\""
 "[com.project/example \"1.4.8-SNAPSHOT\"]"
 " [org.clojure/tools.cli \"0.2.4\"]"
 " [clojurewerkz/mailer \"1.0.0-alpha3\"]")
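To fit the same idea into the shape of the function in the question, the lazy pipeline has to be fully realized before `with-open` closes the reader. Here is a minimal sketch of that, assuming the wanted line numbers are a non-empty set of zero-based indices. The name `select-lines` is hypothetical, and it uses `keep-indexed` in place of the `map vector` / `filter` / `map second` steps. JSON parsing is left out so the example has no external dependencies.

```clojure
(require '[clojure.java.io :as io])

(defn select-lines
  "Returns a vector of the lines of `source` whose zero-based index is in
  the set `wanted`. Hypothetical helper; assumes `wanted` is non-empty."
  [source wanted]
  (with-open [rdr (io/reader source)]
    (let [last-wanted (apply max wanted)]
      (->> (line-seq rdr)
           ;; stop reading once the highest wanted index has been passed
           (take (inc last-wanted))
           ;; keep a line only when its index is wanted
           (keep-indexed (fn [i line] (when (contains? wanted i) line)))
           ;; `into` is eager, so every line is read before the reader closes
           (into [])))))

;; A StringReader stands in for the large file here.
(select-lines (java.io.StringReader. "a\nb\nc\nd\ne") #{1 3})
;; => ["b" "d"]
```

Nothing holds on to the head of the line sequence, so lines that are skipped can be garbage collected as the reader advances. To parse each kept line, a `(map json/read-str)` step could go just before `into`.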
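As for the simplified case in the question, the NullPointerException does not come from having a side effect inside `for`. When `doall` is called with two arguments it treats the first as a count, so `(doall (println i) res)` passes the `nil` returned by `println` as that count. A small sketch of the same loop, using `do` for the side effect and an outer `doall` to realize the lazy `for` (the name `results` is only for illustration):

```clojure
(def results
  (let [res []]
    (doall                        ; realize the lazy seq produced by `for`
      (for [i (range 10)
            :let [res (conj res i)]
            :when (odd? i)]
        (do
          (println i)             ; side effect, runs when the element is realized
          res)))))
;; results is ([1] [3] [5] [7] [9])
```

Note that `res` is rebound fresh on every iteration rather than accumulated, so a side effect inside `for` would still not advance an immutable line sequence. That is why indexing and filtering the lazy sequence is the simpler route.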