Lor*_*lus 3 sql sql-server sql-server-2012
我需要将叙述字段(自由文本)拆分为多行。目前的格式如下:
Case_Reference | Narrative
```````````````|`````````````````````````````````````
XXXX/XX-123456 | [Endless_Text up to ~50k characters]
Run Code Online (Sandbox Code Playgroud)
在作为文本的叙述字段中,各个条目(当不同的代理对案件做了某事时)以条目日期开头,后跟两个空格(即'dd/mm/yyyy '),日期值随着同一字段中的每个条目而变化。
换句话说,在寻找更好的分隔符之后,我唯一可以使用的就是这种格式的字符串,因此我需要识别叙述文本中格式(掩码会是更好的单词吗?)匹配的多个位置'dd/mm/yyyy '。
我可以识别多次出现的一致字符串,没有问题,但它在我本质上寻找的地方识别它:
'%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] %'
PATINDEX当然返回这个的第一次出现/位置,但据我所知,没有办法“修改”这个(即创建的函数)以允许拾取这个的其余出现/位置我们可以CHARINDEX(因为PATINDEX没有起始位置参数)。
为了清楚起见,我并不是在寻找代码来直接分隔它,因为我需要进一步操作每个条目,因此它纯粹是我正在寻找的叙述文本中多次出现字符串的位置。
任何帮助将非常感激。
为了清楚起见,没有选项可以执行此预导入操作,因此需要对此登陆数据进行此操作。
期望的输出是
Case_Reference1 | 1st_Position_of_Delimiter_String
Case_Reference1 | 2nd_Position_of_Delimiter_String
Case_Reference2 | 1st_Position_of_Delimiter_String
Case_Reference2 | 2nd_Position_of_Delimiter_String
Case_Reference2 | 3rd_Position_of_Delimiter_String
Run Code Online (Sandbox Code Playgroud)
您可以使用递归 CTE 来解决这个问题
DECLARE @tbl TABLE (Case_Reference NVARCHAR(MAX),Narrative NVARCHAR(MAX));
INSERT INTO @tbl VALUES
(N'C1',N'01/02/2000 Some text with blanks 02/03/2000 More text 03/04/2000 An even more')
,(N'C2',N'01/02/2000 Test for C2 02/03/2000 One more for C2 03/04/2000 An even more 04/05/2000 Blah')
,(N'C3',N'01/02/2000 Test for C3 02/03/2000 One more for C3 03/04/2000 An even more')
;
WITH recCTE AS
(
SELECT 1 AS Step,Case_Reference,Narrative,CAST(1 AS BIGINT) AS StartsAt,NewPos.EndsAt+10 AS EndsAt,LEN(Narrative) AS MaxLen
,SUBSTRING(Narrative,NewPos.EndsAt+10+1,999999) AS RestString
FROM @tbl AS tbl
CROSS APPLY(SELECT PATINDEX('%[0-3][0-9]/[0-1][0-9]/[1-2][0-9][0-9][0-9] %',SUBSTRING(Narrative,12,9999999))) AS NewPos(EndsAt)
UNION ALL
SELECT r.Step+1,r.Case_Reference,r.Narrative,r.EndsAt+1,CASE WHEN NewPos.EndsAt>0 THEN r.EndsAt+NewPos.EndsAt+10 ELSE r.MaxLen END,r.MaxLen
,SUBSTRING(r.RestString,NewPos.EndsAt+10+1,999999)
FROM recCTE AS r
CROSS APPLY(SELECT PATINDEX('%[0-3][0-9]/[0-1][0-9]/[1-2][0-9][0-9][0-9] %',SUBSTRING(r.RestString,12,99999999))) AS NewPos(EndsAt)
WHERE r.EndsAt<r.MaxLen
)
SELECT Step,Case_Reference,StartsAt,EndsAt
,SUBSTRING(Narrative,StartsAt,EndsAt-StartsAt+1) AS OutputString
FROM recCTE
ORDER BY Case_Reference,Step
Run Code Online (Sandbox Code Playgroud)
结果
+------+----------------+----------+--------+---------------------------------------+
| Step | Case_Reference | StartsAt | EndsAt | OutputString |
+------+----------------+----------+--------+---------------------------------------+
| 1 | C1 | 1 | 38 | 01/02/2000 Some text with blanks |
+------+----------------+----------+--------+---------------------------------------+
| 2 | C1 | 39 | 60 | 02/03/2000 More text |
+------+----------------+----------+--------+---------------------------------------+
| 3 | C1 | 61 | 84 | 03/04/2000 An even more |
+------+----------------+----------+--------+---------------------------------------+
| 1 | C2 | 1 | 24 | 01/02/2000 Test for C2 |
+------+----------------+----------+--------+---------------------------------------+
| 2 | C2 | 25 | 52 | 02/03/2000 One more for C2 |
+------+----------------+----------+--------+---------------------------------------+
| 3 | C2 | 53 | 77 | 03/04/2000 An even more |
+------+----------------+----------+--------+---------------------------------------+
| 4 | C2 | 78 | 93 | 04/05/2000 Blah |
+------+----------------+----------+--------+---------------------------------------+
| 1 | C3 | 1 | 24 | 01/02/2000 Test for C3 |
+------+----------------+----------+--------+---------------------------------------+
| 2 | C3 | 25 | 52 | 02/03/2000 One more for C3 |
+------+----------------+----------+--------+---------------------------------------+
| 3 | C3 | 53 | 76 | 03/04/2000 An even more |
+------+----------------+----------+--------+---------------------------------------+
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
2853 次 |
| 最近记录: |