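I have a text file with one sentence per line. I would like to lemmatize the words in each line using hunspell (the -s option). Since I want the lemmas of each line separately, it makes no sense to submit the whole text file to hunspell at once. I need to send one line after another and get the hunspell output for each line.
Following the answers to "How to process input and output streams in Steel Bank Common Lisp?", I was able to send the whole text file to hunspell one line after another, but I could not capture hunspell's output for each line. How can I interact with the process, sending a line and reading its output before sending the next line?
My current code, which reads the whole text file, is: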
(defun parse-spell-sb (file-in)
  (with-open-file (in file-in)
    (let ((p (sb-ext:run-program "/opt/local/bin/hunspell"
                                 (list "-i" "UTF-8" "-s" "-d" "pt_BR")
                                 :input in :output :stream :wait nil)))
      (when p
        (unwind-protect
             (with-open-stream (o (process-output p))
               (loop
                 :for line := (read-line o nil nil)
                 :while line
                 :collect line))
          (process-close p))))))
Once again, this code gives me hunspell's output for the whole text file. I would like to get the hunspell output for each input line separately.
Any ideas?
I suppose you have a buffering problem with the program you want to run. For example:
(defun program-stream (program &optional args)
  (let ((process (sb-ext:run-program program args
                                     :input :stream
                                     :output :stream
                                     :wait nil
                                     :search t)))
    (when process
      (make-two-way-stream (sb-ext:process-output process)
                           (sb-ext:process-input process)))))
Now, on my system, this works with cat:
CL-USER> (defparameter *stream* (program-stream "cat"))
*STREAM*
CL-USER> (format *stream* "foo bar baz~%")
NIL
CL-USER> (finish-output *stream*) ; will hang without this
NIL
CL-USER> (read-line *stream*)
"foo bar baz"
NIL
CL-USER> (close *stream*)
T
Note the finish-output: without it, the read will hang. (There is also force-output.)
Python in interactive mode works too:
CL-USER> (defparameter *stream* (program-stream "python" '("-i")))
*STREAM*
CL-USER> (loop while (read-char-no-hang *stream*)) ; skip startup message
NIL
CL-USER> (format *stream* "1+2~%")
NIL
CL-USER> (finish-output *stream*)
NIL
CL-USER> (read-line *stream*)
"3"
NIL
CL-USER> (close *stream*)
T
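But if you try this without the -i option (or a similar option such as -u), you will probably be out of luck because of the buffering going on. For example, on my system, reading from tr hangs: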
CL-USER> (defparameter *stream* (program-stream "tr" '("a-z" "A-Z")))
*STREAM*
CL-USER> (format *stream* "foo bar baz~%")
NIL
CL-USER> (finish-output *stream*)
NIL
CL-USER> (read-line *stream*) ; hangs
; Evaluation aborted on NIL.
CL-USER> (read-char-no-hang *stream*)
NIL
CL-USER> (close *stream*)
T
Since tr does not provide a switch to turn off buffering, we wrap the call in a pty wrapper (in this case unbuffer, from expect):
CL-USER> (defparameter *stream* (program-stream "unbuffer"
                                                '("-p" "tr" "a-z" "A-Z")))
*STREAM*
CL-USER> (format *stream* "foo bar baz~%")
NIL
CL-USER> (finish-output *stream*)
NIL
CL-USER> (read-line *stream*)
"FOO BAR BAZ
"
NIL
CL-USER> (close *stream*)
T
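If you end up wrapping several programs this way, the unbuffer call can be folded into a small helper. The following is only a sketch: unbuffer-args, unbuffered-program-stream and strip-carriage-return are names I made up for illustration, and it assumes SBCL with the unbuffer script from expect on the PATH. The reply in the transcript above ends with an extra character, most likely a carriage return added by the pty, so the last helper trims that.

```lisp
;; Sketch only. Assumes SBCL and that the `unbuffer` script from expect is
;; on the PATH. All three functions are hypothetical helpers, not library API.

(defun unbuffer-args (program args)
  "Build the argument list for running PROGRAM with ARGS under unbuffer -p."
  (list* "-p" program args))

(defun unbuffered-program-stream (program &optional args)
  "Start PROGRAM under unbuffer and return a two-way stream connected to it."
  (let ((process (sb-ext:run-program "unbuffer" (unbuffer-args program args)
                                     :input :stream
                                     :output :stream
                                     :wait nil
                                     :search t)))
    (when process
      (make-two-way-stream (sb-ext:process-output process)
                           (sb-ext:process-input process)))))

(defun strip-carriage-return (line)
  "Remove trailing carriage returns that a pty may have appended to LINE."
  (string-right-trim '(#\Return) line))
```

With this, (unbuffered-program-stream "tr" '("a-z" "A-Z")) stands in for the explicit program-stream call above, and each line read back can be passed through strip-carriage-return before further processing.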
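So, long story short: try calling finish-output on the stream before reading. If that does not work, check for command-line options that disable buffering. If it still does not work, you could try wrapping the program in some kind of pty wrapper.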
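Putting the pieces together for the "one line in, one reply out" case, a loop over the input lines could look roughly like this. It is a minimal sketch, not a tested hunspell solution: ask-line and map-lines-through are hypothetical names, and the code assumes SBCL and a child program that answers each input line with exactly one output line and does not buffer its output (as cat does on my system).

```lisp
;; Sketch only. Assumes SBCL and a child program that answers every input
;; line with exactly one output line, as cat does. Both function names are
;; hypothetical.

(defun ask-line (stream line)
  "Write LINE to STREAM, flush it, and return the next line read back."
  (write-line line stream)
  (finish-output stream)            ; without this the read below can hang
  (read-line stream nil nil))

(defun map-lines-through (program args lines)
  "Send each string in LINES to PROGRAM and collect one reply per line."
  (let ((process (sb-ext:run-program program args
                                     :input :stream
                                     :output :stream
                                     :wait nil
                                     :search t)))
    (when process
      (unwind-protect
           (let ((stream (make-two-way-stream
                          (sb-ext:process-output process)
                          (sb-ext:process-input process))))
             (loop :for line :in lines
                   :collect (ask-line stream line)))
        ;; Close the pipes to the child so it sees end of input.
        (sb-ext:process-close process)))))
```

For example, (map-lines-through "cat" nil '("foo" "bar")) should give back the two strings unchanged. For hunspell the reply to one sentence will generally span several lines, so ask-line would have to keep reading until whatever marks the end of a reply in hunspell's output; I have not checked what that is, so treat that part as open.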