Nic*_*las 5 string split common-lisp
How can I split a string by a delimiter in Common Lisp, as SPLIT-SEQUENCE does, but also include the delimiters in the resulting list of strings?
For example, I could write:
(split-string-with-delimiter #\. "a.bc.def.com")
and the result would be ("a" "." "bc" "." "def" "." "com").
I tried the following code (make-adjustable-string makes a string that can be extended with vector-push-extend):
(defun make-adjustable-string (s)
  (make-array (length s)
              :fill-pointer (length s)
              :adjustable t
              :initial-contents s
              :element-type (array-element-type s)))

(defun split-str (string &key (delimiter #\Space) (keep-delimiters nil))
  "Splits a string into a list of strings, with the delimiter still
in the resulting list."
  (let ((words nil)
        (current-word (make-adjustable-string "")))
    (do* ((i 0 (+ i 1))
          (x (char string i) (char string i)))
         ((= (+ i 1) (length string)) nil)
      (if (eql delimiter x)
          (unless (string= "" current-word)
            (push current-word words)
            (push (string delimiter) words)
            (setf current-word (make-adjustable-string "")))
          (vector-push-extend x current-word)))
    (nreverse words)))
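But this does not print out the last substring/word. I'm not sure what is going on.
Thanks in advance for your help!
If you are just looking for a solution, and not doing this as an exercise, you can use cl-ppcre: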
CL-USER> (cl-ppcre:split "(\\.)" "a.bc.def.com" :with-registers-p t)
("a" "." "bc" "." "def" "." "com")
Something like this?
An example using subseq:
(defun split-string-with-delimiter (string
                                    &key (delimiter #\Space)
                                         (keep-delimiters nil)
                                    &aux (l (length string)))
  (loop for start = 0 then (1+ pos)
        for pos = (position delimiter string :start start)
        ;; no more delimiter found
        when (and (null pos) (not (= start l)))
          collect (subseq string start)
        ;; while delimiter found
        while pos
        ;; some content found
        when (> pos start) collect (subseq string start pos)
        ;; optionally keep delimiter
        when keep-delimiters collect (string delimiter)))
Examples:
CL-USER 120 > (split-string-with-delimiter "..1.2.3.4.."
:delimiter #\. :keep-delimiters nil)
("1" "2" "3" "4")
CL-USER 121 > (split-string-with-delimiter "..1.2.3.4.."
:delimiter #\. :keep-delimiters t)
("." "." "1" "." "2" "." "3" "." "4" "." ".")
CL-USER 122 > (split-string-with-delimiter "1.2.3.4"
:delimiter #\. :keep-delimiters nil)
("1" "2" "3" "4")
CL-USER 123 > (split-string-with-delimiter "1.2.3.4"
:delimiter #\. :keep-delimiters t)
("1" "." "2" "." "3" "." "4")
Or modified to work with any sequence (lists, vectors, strings, ...):
(defun split-sequence-with-delimiter (sequence delimiter
                                      &key (keep-delimiters nil)
                                      &aux (end (length sequence)))
  (loop for start = 0 then (1+ pos)
        for pos = (position delimiter sequence :start start)
        ;; no more delimiter found
        when (and (null pos) (not (= start end)))
          collect (subseq sequence start)
        ;; while delimiter found
        while pos
        ;; some content found
        when (> pos start) collect (subseq sequence start pos)
        ;; optionally keep delimiter
        when keep-delimiters collect (subseq sequence pos (1+ pos))))
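The sequence version compares elements with POSITION's default test, EQL, so it will not find a delimiter such as a string inside a list of strings. A minimal sketch of one way around that, assuming a :test keyword that is passed straight through to POSITION; the function name split-sequence-with-delimiter* and the :test parameter are hypothetical and not part of the answer above:

```lisp
;; Hypothetical variant of the sequence splitter with a :TEST keyword
;; (an assumption for illustration), passed through to POSITION.
(defun split-sequence-with-delimiter* (sequence delimiter
                                       &key (keep-delimiters nil)
                                            (test #'eql)
                                       &aux (end (length sequence)))
  (loop for start = 0 then (1+ pos)
        for pos = (position delimiter sequence :start start :test test)
        ;; no more delimiters: collect the remaining tail, if any
        when (and (null pos) (/= start end))
          collect (subseq sequence start)
        while pos
        ;; non-empty content before the delimiter
        when (> pos start)
          collect (subseq sequence start pos)
        ;; optionally keep the delimiter as a one-element subsequence
        when keep-delimiters
          collect (subseq sequence pos (1+ pos))))

(split-sequence-with-delimiter* '(1 2 0 3 0 4) 0)
;; => ((1 2) (3) (4))

(split-sequence-with-delimiter* '("a" "|" "b") "|"
                                :test #'string= :keep-delimiters t)
;; => (("a") ("|") ("b"))

(split-sequence-with-delimiter* #(1 2 0 3) 0 :keep-delimiters t)
;; => (#(1 2) #(0) #(3))
```

Each piece is a subsequence of the same kind as the input, so a kept delimiter comes back as a one-element list or vector rather than as a bare element.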
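The problem lies in the end condition of the do* loop. The loop exits as soon as i reaches the last index of the string, before the body has handled the final character, so there is still a current-word that has not been added to words. When the end condition is met, you need to add x to current-word and then add current-word to words before leaving the loop: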
(defun split-string-with-delimiter (string delimiter)
  "Splits a string into a list of strings, with the delimiter still
in the resulting list."
  (let ((words nil)
        (current-word (make-adjustable-string "")))
    (do* ((i 0 (+ i 1))
          (x (char string i) (char string i)))
         ((>= (+ i 1) (length string))
          (progn (vector-push-extend x current-word)
                 (push current-word words)))
      (if (eql delimiter x)
          (unless (string= "" current-word)
            (push current-word words)
            (push (string delimiter) words)
            (setf current-word (make-adjustable-string "")))
          (vector-push-extend x current-word)))
    (nreverse words)))
Note, however, that this version still has a bug: if the last character of the string is a delimiter, it ends up inside the last word, e.g. (split-string-with-delimiter "a.bc.def." #\.) => ("a" "." "bc" "." "def.")
I'll leave adding that check to you.
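One possible way to add that check is sketched below. It is not the code from this answer: it swaps the adjustable string for a string output stream, iterates with loop instead of do*, and flushes the pending word once more after the loop ends. The name split-keeping-delimiters is hypothetical. It also assumes that every delimiter should be kept, including leading or repeated ones, which differs from the unless branch above.

```lisp
;; Hypothetical name; a sketch, not the answer's original code.
(defun split-keeping-delimiters (string delimiter)
  "Split STRING at DELIMITER (a character), keeping each delimiter as its
own one-character string. Empty pieces are dropped."
  (let ((words '())
        (current (make-string-output-stream)))
    (flet ((flush ()
             ;; GET-OUTPUT-STREAM-STRING returns the accumulated text
             ;; and clears the stream for the next word.
             (let ((word (get-output-stream-string current)))
               (unless (string= word "")
                 (push word words)))))
      (loop for ch across string
            do (cond ((char= ch delimiter)
                      (flush)
                      (push (string delimiter) words))
                     (t
                      (write-char ch current))))
      ;; The final word has no delimiter after it, so flush once more.
      (flush))
    (nreverse words)))

(split-keeping-delimiters "a.bc.def.com" #\.)
;; => ("a" "." "bc" "." "def" "." "com")

(split-keeping-delimiters "a.bc.def." #\.)
;; => ("a" "." "bc" "." "def" ".")
```

Because the loop walks the string with across, an empty input simply returns NIL instead of signalling an error at (char string 0).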
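In any case, you may want to make this more efficient by looking ahead for the next delimiter and extracting all the characters between the current i and that delimiter in one step, as a single substring.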