用于文本编辑器的纯功能数据结构

Question

用于文本编辑器的纯功能数据结构

fre*_*low 39 haskell functional-programming scala text-editor data-structures

什么是文本编辑器的纯功能数据结构？我希望能够在文本中插入单个字符并以可接受的效率从文本中删除单个字符,并且我希望能够保留旧版本,因此我可以轻松地撤消更改.

我应该只使用字符串列表并重用不同版本的行吗？

Answer 1

pig*_*ker 54

我不知道这个建议对于"好"的复杂定义是否"好",但它很容易和有趣.我经常设置一个练习来编写Haskell中文本编辑器的核心,并链接我提供的渲染代码.数据模型如下.

首先,我定义了x--elements 列表中的游标是什么,其中游标上可用的信息具有某种类型m.(结果x将是Char或String.)

type Cursor x m = (Bwd x, m, [x])

Run Code Online (Sandbox Code Playgroud)

这Bwd件事只是落后的"snoc-lists".我想保持强烈的空间直觉,所以我在我的代码中扭转局面,而不是在我脑海中.这个想法是最靠近光标的东西是最容易访问的.这就是拉链的精神.

data Bwd x = B0 | Bwd x :< x deriving (Show, Eq)

Run Code Online (Sandbox Code Playgroud)

我提供了一个免费的单例类型来充当游标的可读标记......

data Here = Here deriving Show

Run Code Online (Sandbox Code Playgroud)

......因此,我可以说出某个地方的意义 String

type StringCursor = Cursor Char Here

Run Code Online (Sandbox Code Playgroud)

现在,为了表示多行的缓冲区,我们需要String使用光标在线的上方和下方,以及StringCursor在我们当前正在编辑的线的中间.

type TextCursor = Cursor String StringCursor

Run Code Online (Sandbox Code Playgroud)

这个TextCursor类型是我用来表示编辑缓冲区的状态.这是一个两层拉链.我为学生提供了代码,用于在启用ANSI-escape的shell窗口中渲染文本的视口,确保视口包含光标.他们所要做的就是实现更新TextCursor响应键击的代码.

handleKey :: Key -> TextCursor -> Maybe (Damage, TextCursor)

Run Code Online (Sandbox Code Playgroud)

如果击键没有意义,handleKey应该返回哪里Nothing,否则提供Just更新的TextCursor和"损坏报告",后者是其中之一

data Damage
  = NoChange       -- use this if nothing at all happened
  | PointChanged   -- use this if you moved the cursor but kept the text
  | LineChanged    -- use this if you changed text only on the current line
  | LotsChanged    -- use this if you changed text off the current line
  deriving (Show, Eq, Ord)

Run Code Online (Sandbox Code Playgroud)

(如果您想知道返回Nothing和返回之间的区别是什么Just (NoChange, ...),请考虑您是否还希望编辑器发出蜂鸣声.)损坏报告告诉渲染器需要做多少工作才能使显示的图像保持最新状态.

该Key类型仅为可能的击键提供可读的dataype表示,从原始ANSI转义序列中抽象出来.它并不起眼.

通过提供这些工具包,我为学生们提供了一个关于上下这个数据模型的大线索:

deactivate :: Cursor x Here -> (Int, [x])
deactivate c = outward 0 c where
  outward i (B0, Here, xs)       = (i, xs)
  outward i (xz :< x, Here, xs)  = outward (i + 1) (xz, Here, x : xs)

Run Code Online (Sandbox Code Playgroud)

该deactivate函数用于将焦点移出a Cursor,给你一个普通的列表,但告诉你光标在哪里.相应的activate函数尝试将光标放在列表中的给定位置:

activate :: (Int, [x]) -> Cursor x Here
activate (i, xs) = inward i (B0, Here, xs) where
  inward _ c@(_, Here, [])     = c  -- we can go no further
  inward 0 c                   = c  -- we should go no further
  inward i (xz, Here, x : xs)  = inward (i - 1) (xz :< x, Here, xs)  -- and on!

Run Code Online (Sandbox Code Playgroud)

我向学生提供了故意不正确和不完整的定义 handleKey

handleKey :: Key -> TextCursor -> Maybe (Damage, TextCursor)
handleKey (CharKey c)  (sz,
                        (cz, Here, cs),
                        ss)
  = Just (LineChanged, (sz,
                        (cz, Here, c : cs),
                        ss))
handleKey _ _ = Nothing

Run Code Online (Sandbox Code Playgroud)

它只处理普通的字符击键,但使文本向后出现.人们很容易看到的字符c出现正确的Here.我邀请他们修复错误并添加箭头键,退格键,删除键,返回键等功能.

它可能不是最有效的表示,但它纯粹是功能性的,并使代码能够具体地符合我们关于正在编辑的文本的空间直觉.

Answer 2

Lui*_*hys 10

A Vector[Vector[Char]]可能是一个不错的选择.IndexedSeq与List你提到的不同,它具有不错的更新/前置/更新性能.如果你看一下Performance Characteristics,它是唯一提到的具有有效恒定时间更新的不可变集合.

Answer 3

Don*_*art 7

我们在Yi中使用了一个文本拉链,这是Haskell中一个严肃的文本编辑器实现.

下面描述了不可变状态类型的实现,

http://publications.lib.chalmers.se/records/fulltext/local_94979.pdf

http://publications.lib.chalmers.se/records/fulltext/local_72549.pdf

和其他论文.

Answer 4

Nov*_*zen 5

Answer 5

Pet*_*lák 5

我建议将拉链与基于Finger trees的 Data.Sequence.Seq结合使用。所以你可以将当前状态表示为

data Cursor = Cursor { upLines :: Seq Line
                     , curLine :: CurLine
                     , downLines :: Seq Line }

Run Code Online (Sandbox Code Playgroud)

这为您提供了O(1)复杂度，用于将光标向上/向下移动一行，并且由于splitAt和(><)(union) 都具有O(log(min(n1,n2)))复杂度，因此您将得到O(log(L) )向上/向下跳过L行的复杂性。

您可以使用类似的拉链结构来CurLine保留光标之前、光标处和光标之后的字符序列。

Line可以是节省空间的东西，例如ByteString。

归档时间：	13 年，5 月前
查看次数：	3800 次
最近记录：	13 年，3 月前