I have a form:
<textarea name="test">
Suppose the user enters the following text:
This is the first paragraph
It has two lines

This is the second paragraph
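I want to split this text into a list: ["This is the first paragraph\nIt has two lines", "This is the second paragraph"]
I thought this would work: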
temp = self.request.get("test")
list = re.split(r'\n\n', temp)
But it doesn't. However,
temp = self.request.get("test")
list = re.split(r'\n', temp)
produces the following list: ["This is the first line", "", "This is the second line"]
What am I missing?
Also:
Assuming there may be one or two blank lines between the paragraphs, would this make sense?
temp = self.request.get("test")
list = re.split(r'(\n){2,3}', temp)
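A note on the pattern above: the parentheses form a capturing group, and re.split also returns whatever a capturing group matched. The short sketch below compares it with the same pattern without the group. The sample string is made up for illustration and uses bare \n only, which may not be what the form actually submits.

```python
import re

# Hypothetical sample: one blank line, then two blank lines, between pieces.
text = "a\n\nb\n\n\nc"

# With a capturing group, re.split also returns the captured separator text.
with_group = re.split(r"(\n){2,3}", text)
print(with_group)     # ['a', '\n', 'b', '\n', 'c']

# Without a capturing group, only the pieces between separators come back.
without_group = re.split(r"\n{2,3}", text)
print(without_group)  # ['a', 'b', 'c']
```

If the grouping is needed for some other reason, a non-capturing group, (?:\n){2,3}, should behave like the second call.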
Solution:
With the help of the answers below, I found that the following code works for my case:
temp = self.request.get("test")
list = [l for l in temp.split('\r\n\r\n') if l.split()]
I think the line break may depend on which system the input comes from, so this may not be a perfect solution.
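One way to avoid depending on a particular line ending is to normalize the text before splitting. This is only a sketch: the helper name split_paragraphs and the sample string are made up for illustration, and it assumes the input uses \r\n, \r, or \n as line breaks (browsers typically submit textarea line breaks as \r\n).

```python
import re

def split_paragraphs(text):
    """Split text into paragraphs separated by one or more blank lines."""
    # Normalize Windows (\r\n) and old Mac (\r) line endings to \n.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Two or more consecutive newlines mark a paragraph boundary.
    # The pattern has no capturing group, so separators are not returned.
    parts = re.split(r"\n{2,}", normalized)
    # Drop empty or whitespace-only chunks and trim stray newlines at the edges.
    return [p.strip("\n") for p in parts if p.strip()]

# Hypothetical input as a browser might submit it.
sample = "This is the first paragraph\r\nIt has two lines\r\n\r\nThis is the second paragraph"
print(split_paragraphs(sample))
# ['This is the first paragraph\nIt has two lines', 'This is the second paragraph']
```

Lines inside a paragraph are left joined by \n, which matches the list asked for in the question.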
I think the re module may be overkill here. Just split the content on \n and drop the empty strings.
>>> s = """This is the text
...
... I am interested in splitting,
...
...
... but I want to remove blank lines!"""
>>> lines = [l for l in s.split("\n") if l]
>>> lines
['This is the text', 'I am interested in splitting,', 'but I want to remove blank lines!']
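Note that the session above returns individual lines, not paragraphs. If the lines of each paragraph should stay together, as in the question, a regex-free sketch could group consecutive non-blank lines. The function name paragraphs_without_re and the sample string are hypothetical; str.splitlines and itertools.groupby are standard library.

```python
from itertools import groupby

def paragraphs_without_re(text):
    """Group consecutive non-blank lines into paragraphs, without using re."""
    # splitlines() handles \n, \r\n and \r (plus a few rarer separators).
    lines = text.splitlines()
    # groupby yields runs of lines that share the same "has text" flag.
    runs = groupby(lines, key=lambda line: bool(line.strip()))
    # Keep only the runs that contain text, and rejoin their lines with \n.
    return ["\n".join(run) for has_text, run in runs if has_text]

# Hypothetical input with two blank lines between the paragraphs.
sample_text = "This is the first paragraph\r\nIt has two lines\r\n\r\n\r\nThis is the second paragraph"
print(paragraphs_without_re(sample_text))
# ['This is the first paragraph\nIt has two lines', 'This is the second paragraph']
```

Because blank runs are skipped as a whole, any number of blank lines between paragraphs is handled the same way.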
Plain string.split also appears to be more than twice as fast in this quick timing:
> python -m timeit -s 's = "This is the text\n\nthat I want to split\n\n\nand remove empty lines"; import re;' '[l for l in re.split(r"\n", s) if l]'
100000 loops, best of 3: 2.84 usec per loop
> python -m timeit -s 's = "This is the text\n\nthat I want to split\n\n\nand remove empty lines"' '[l for l in s.split("\n") if l]'
1000000 loops, best of 3: 1.08 usec per loop