提高字符串操作的速度

6 performance common-lisp

这是一个后续问题,有点像:写一个有效的字符串替换函数?.

在(虽然遥远)未来,我希望能够进行自然语言处理.当然,字符串操作的速度很重要.无意中,我偶然发现了这个测试:http://raid6.com.au/~onlyjob/posts/arena/ - 所有测试都有偏见,这也不例外.但是,这对我提出了重要的问题.所以我写了几个测试来看看我是怎么做的:

这是我的第一次尝试(我称之为#A):

#一个

(defun test ()
  (declare (optimize (debug 0) (safety 0) (speed 3)))
  (loop with addidtion = (concatenate 'string "abcdefgh" "efghefgh")
     and initial = (get-internal-real-time)
     for i from 0 below (+ (* (/ 1024 (length addidtion)) 1024 4) 1000)
     for ln = (* (length addidtion) i)
     for accumulated = addidtion
     then (loop with concatenated = (concatenate 'string accumulated addidtion)
             for start = (search "efgh" concatenated)
             while start do (replace concatenated "____" :start1 start)
             finally (return concatenated))
     when (zerop (mod ln (* 1024 256))) do
       (format t "~&~f s | ~d Kb" (/ (- (get-internal-real-time) initial) 1000) (/ ln 1024)))
  (values))

(test)
Run Code Online (Sandbox Code Playgroud)

对结果cl-ppcre感到困惑,我试图使用- 我不知道我希望的是什么,但结果显得非常糟糕......以下是我用于测试的代码:

#B

(ql:quickload "cl-ppcre")

(defun test ()
  (declare (optimize (debug 0) (safety 0) (speed 3)))
  (loop with addidtion = (concatenate 'string "abcdefgh" "efghefgh")
     and initial = (get-internal-real-time)
     for i from 0 below (+ (* (/ 1024 (length addidtion)) 1024 4) 1000)
     for ln = (* (length addidtion) i)
     for accumulated = addidtion
     then (cl-ppcre:regex-replace-all "efgh" (concatenate 'string accumulated addidtion) "____")
     when (zerop (mod ln (* 1024 256))) do
       (format t "~&~f s | ~d Kb" (/ (- (get-internal-real-time) initial) 1000) (/ ln 1024)))
  (values))

(test)
Run Code Online (Sandbox Code Playgroud)

好吧,那么,为了避免一些概括,我决定编写自己的,尽管有些天真的版本:

#C

(defun replace-all (input match replacement)
  (declare (type string input match replacement)
           (optimize (debug 0) (safety 0) (speed 3)))
  (loop with pattern fixnum = (1- (length match))
     with i fixnum = pattern
     with j fixnum = i
     with len fixnum = (length input) do
       (cond
         ((>= i len) (return input))
         ((zerop j)
          (loop do
               (setf (aref input i) (aref replacement j) i (1+ i))
               (if (= j pattern)
                   (progn (incf i pattern) (return))
                   (incf j))))
         ((char= (aref input i) (aref match j))
          (decf i) (decf j))
         (t (setf i (+ i 1 (- pattern j)) j pattern)))))

(defun test ()
  (declare (optimize (debug 0) (safety 0) (speed 3)))
  (loop with addidtion string = (concatenate 'string "abcdefgh" "efghefgh")
     and initial = (get-internal-real-time)
     for i fixnum from 0 below (+ (* (/ 1024 (length addidtion)) 1024 4) 1000)
     for ln fixnum = (* (length addidtion) i)
     for accumulated string = addidtion
     then (replace-all (concatenate 'string accumulated addidtion) "efgh" "____")
     when (zerop (mod ln (* 1024 256))) do
       (format t "~&~f s | ~d Kb" (/ (- (get-internal-real-time) initial) 1000) (/ ln 1024)))
  (values))

(test)
Run Code Online (Sandbox Code Playgroud)

差不多一样cl-ppcre!现在,这太不可思议了!没有什么我可以在这里看到这样会导致如此糟糕的表现......而且它确实很糟糕:(

我意识到标准函数到目前为止表现最好,我查看了SBCL源代码,经过一些阅读后我得出了这个:

#D

(defun replace-all (input match replacement &key (start 0))
  (declare (type simple-string input match replacement)
           (type fixnum start)
           (optimize (debug 0) (safety 0) (speed 3)))
  (loop with input-length fixnum = (length input)
     and match-length fixnum = (length match)
     for i fixnum from 0 below (ceiling (the fixnum (- input-length start)) match-length) do
       (loop with prefix fixnum = (+ start (the fixnum (* i match-length)))
          for j fixnum from 0 below match-length do
            (when (<= (the fixnum (+ prefix j match-length)) input-length)
              (loop for k fixnum from (+ prefix j) below (the fixnum (+ prefix j match-length))
                 for n fixnum from 0 do
                   (unless (char= (aref input k) (aref match n)) (return))
                 finally
                   (loop for m fixnum from (- k match-length) below k
                      for o fixnum from 0 do
                        (setf (aref input m) (aref replacement o))
                      finally
                        (return-from replace-all
                          (replace-all input match replacement :start k))))))
       finally (return input)))

(defun test ()
  (declare (optimize (debug 0) (safety 0) (speed 3)))
  (loop with addidtion string = (concatenate 'string "abcdefgh" "efghefgh")
     and initial = (get-internal-real-time)
     for i fixnum from 0 below (+ (* (/ 1024 (length addidtion)) 1024 4) 1000)
     for ln fixnum = (* (length addidtion) i)
     for accumulated string = addidtion
     then (replace-all (concatenate 'string accumulated addidtion) "efgh" "____")
     when (zerop (mod ln (* 1024 256))) do
       (format t "~&~f s | ~d Kb" (/ (- (get-internal-real-time) initial) 1000) (/ ln 1024)))
  (values))

(test)
Run Code Online (Sandbox Code Playgroud)

最后,我可以获胜,虽然对标准库只有很小一部分性能 - 但与几乎所有其他东西相比,它仍然非常糟糕......

这是表格的结果:

| SBCL #A   | SBCL #B   | SBCL #C    | SBCL #D   | C gcc 4 -O3 | String size |
|-----------+-----------+------------+-----------+-------------+-------------|
| 17.463 s  | 166.254 s | 28.924 s   | 16.46 s   | 1 s         | 256 Kb      |
| 68.484 s  | 674.629 s | 116.55 s   | 63.318 s  | 4 s         | 512 Kb      |
| 153.99 s  | gave up   | 264.927 s  | 141.04 s  | 10 s        | 768 Kb      |
| 275.204 s | . . . . . | 474.151 s  | 251.315 s | 17 s        | 1024 Kb     |
| 431.768 s | . . . . . | 745.737 s  | 391.534 s | 27 s        | 1280 Kb     |
| 624.559 s | . . . . . | 1079.903 s | 567.827 s | 38 s        | 1536 Kb     |
Run Code Online (Sandbox Code Playgroud)

现在,问题是:我做错了什么?这是Lisp字符串固有的东西吗?这可能通过......什么来缓解?

在远景中,我甚至考虑为字符串处理编写专门的库.如果问题不是我的坏代码,而是实现.这样做有意义吗?如果是,您会建议使用哪种语言?


编辑:只是为了记录,我现在正在尝试使用这个库:https://github.com/Ramarren/ropes来处理字符串连接.不幸的是,它没有替换函数,并且进行多次替换并不是非常简单.但是当我有东西时,我会更新这篇文章.


我试图稍微改变huaiyuan的变体来使用数组的填充指针而不是字符串连接(以实现类似于StringBuilderPaulo Madeira建议的东西.它可能会进一步优化,但我不确定类型/将使用哪种方法更快/将它值得重新定义类型*+让他们只对操作fixnumsigned-byte总之,这里的代码和基准:

(defun test/e ()
  (declare (optimize speed))
  (labels ((min-power-of-two (num)
             (declare (type fixnum num))
             (decf num)
             (1+
              (progn
                (loop for i fixnum = 1 then (the (unsigned-byte 32) (ash i 1))
                   while (< i 17) do
                     (setf num
                           (logior
                            (the fixnum
                                 (ash num (the (signed-byte 32)
                                               (+ 1 (the (signed-byte 32)
                                                         (lognot i)))))) num))) num)))
           (join (x y)
             (let ((capacity (array-dimension x 0))
                   (desired-length (+ (length x) (length y)))
                   (x-copy x))
               (declare (type fixnum capacity desired-length)
                        (type (vector character) x y x-copy))
               (when (< capacity desired-length)
                 (setf x (make-array
                          (min-power-of-two desired-length)
                          :element-type 'character
                          :fill-pointer desired-length))
                 (replace x x-copy))
               (replace x y :start1 (length x))
               (setf (fill-pointer x) desired-length) x))
           (seek (old str pos)
             (let ((q (position (aref old 0) str :start pos)))
               (and q (search old str :start2 q))))
           (subs (str old new)
             (loop for p = (seek old str 0) then (seek old str p)
                while p do (replace str new :start1 p))
             str))
    (declare (inline min-power-of-two join seek subs)
             (ftype (function (fixnum) fixnum) min-power-of-two))
    (let* ((builder
            (make-array 16 :element-type 'character
                        :initial-contents "abcdefghefghefgh"
                        :fill-pointer 16))
           (ini (get-internal-real-time)))
      (declare (type (vector character) builder))
      (loop for i fixnum below (+ 1000 (* 4 1024 1024 (/ (length builder))))
         for j = builder then
           (subs (join j builder) "efgh" "____")
         for k fixnum = (* (length builder) i)
         when (= 0 (mod k (* 1024 256)))
         do (format t "~&~8,2F sec ~8D kB"
                    (/ (- (get-internal-real-time) ini) 1000)
                    (/ k 1024))))))
Run Code Online (Sandbox Code Playgroud)
    1.68 sec      256 kB
    6.63 sec      512 kB
   14.84 sec      768 kB
   26.35 sec     1024 kB
   41.01 sec     1280 kB
   59.55 sec     1536 kB
   82.85 sec     1792 kB
  110.03 sec     2048 kB
Run Code Online (Sandbox Code Playgroud)

hua*_*uan 5

瓶颈是search功能,可能在SBCL中没有优化.以下版本用于position帮助它跳过不可能的区域,速度大约是我的机器上的#A版本的10倍:

(defun test/e ()
  (declare (optimize speed))
  (labels ((join (x y)
             (concatenate 'simple-base-string x y))
           (seek (old str pos)
             (let ((q (position (char old 0) str :start pos)))
               (and q (search old str :start2 q))))
           (subs (str old new)
             (loop for p = (seek old str 0) then (seek old str p)
                   while p do (replace str new :start1 p))
             str))
    (declare (inline join seek subs))
    (let* ((str (join "abcdefgh" "efghefgh"))
           (ini (get-internal-real-time)))
      (loop for i below (+ 1000 (* 4 1024 1024 (/ (length str))))
            for j = str then (subs (join j str) "efgh" "____")
            for k = (* (length str) i)
            when (= 0 (mod k (* 1024 256)))
              do (format t "~&~8,2F sec ~8D kB"
                         (/ (- (get-internal-real-time) ini) 1000)
                         (/ k 1024))))))
Run Code Online (Sandbox Code Playgroud)