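I'm trying to trim down a large database so that I can use the relevant information in a JSON file. It has some very long lines (about 400 characters each) and several thousand entries. Depending on the line, I need to strip everything from ( to the end of the line, everything from http to the end of the line, or everything from MISSING to the end of the line.
Most lines do not contain the () [] information, but all of them contain the http information. On lines that contain both, the http information always follows the () information.
Here is an example; I cut the lines short for obvious reasons.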
PCSH10160 Attack of the Toy Tanks (3.61+!) [3.69] http://zeu
PCSH10162 Paradox Soul http://zeus.dl.playstation.net/cdn
PCSH10146 Hoggy2 http://zeus.dl.playstation.net/cdn/HP2005/
PCSB01394 Mekabolt http://zeus.dl.playstation.net/cdn/EP0
PCSH10186 Himno http://zeus.dl.playstation.net/cdn/HP2
PCSG01285 MELLKISS http://zeus.dl.playstation.net/cdn/JP0
PCSB01365 Habroxia http://zeus.dl.playstation.net/cdn/EP5
PCSE01423 Color Slayer http://zeus.dl.playstation.net/cdn
PCSE01396 Habroxia http://zeus.dl.playstation.net/cdn/UP4
PCSG01127 Sen no Hatou, Tsukisome no Kouki http://zeus.dl
PCSB01396 Tic-Tac-Letters by POWGI http://zeus.dl.playsta
PCSH10203 Gravity Duck http://zeus.dl.playstation.net
PCSH10175 Crossovers by POWGI http://zeus.dl.playstation
PCSH10169 Mixups by POWGI (3.61+!) [3.69] http://zeus.dl
PCSH10167 One Word by POWGI http://zeus.dl.playstation
PCSH10166 Word Search by POWGI http://zeus.dl.playsta
PCSH10179 Word Wheel by POWGI http://zeus.dl.playstation
PCSH10180 Wordsweeper by POWGI http://zeus.dl.playsta
PCSH10168 Word Sudoku by POWGI http://zeus.dl.playsta
PCSB00625 SENRAN KAGURA: Bon Appétit! Stacked Soundtrack ht
The final result should be:
PCSH10160 Attack of the Toy Tanks
PCSH10162 Paradox Soul
PCSH10146 Hoggy2
PCSB01394 Mekabolt
PCSH10186 Himno
PCSG01285 MELLKISS
PCSB01365 Habroxia
PCSE01423 Color Slayer
PCSE01396 Habroxia
PCSG01127 Sen no Hatou, Tsukisome no Kouki
PCSB01396 Tic-Tac-Letters by POWGI
PCSH10203 Gravity Duck
PCSH10175 Crossovers by POWGI
PCSH10169 Mixups by POWGI
PCSH10167 One Word by POWGI
PCSH10166 Word Search by POWGI
PCSH10179 Word Wheel by POWGI
PCSH10180 Wordsweeper by POWGI
PCSH10168 Word Sudoku by POWGI
PCSB00625 SENRAN KAGURA: Bon Appétit! Stacked Soundtrack
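I don't care about the spacing between the ID and the title, since that can be fixed manually.
Oops, I goofed. After running the provided expression, I noticed a small number of lines that contain the word MISSING followed by various information. Is there a way to include it in the expression alongside ( and http?
A separate expression would also be fine. It just needs to respect case, because I'm worried that the word "missing" appears in a title somewhere and the title would be cut off at that point.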
PCSG00742 Kiss Ato
PCSG00744 One Piece: Burning Blood - Gold Edition
PCSG00747 Zero Escape: Zero Time Dilemma
PCSG00748 Jikkyou Powerful Pro Yakyuu 2016 MISSING KO5ifR1dQ+d7
PCSG00750 Kai-ri-Sei Million Arthur
PCSG00751 Arcana Famiglia -La Storia Della Arcana Famiglia- Ancora
PCSG00752 Touhou Soujinengi V
PCSG00753 Eikoku Tantei Mysteria: The Crown MISSING KO5ifR1dQ+d7
PCSG00756 I am Setsuna
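On the case-sensitivity worry, here is a minimal Python sketch (an assumption, since the question is about a find-and-replace expression rather than a script). It suggests that matching the literal upper-case MISSING leaves a title containing "Missing" untouched. The second sample line, including its ID, is hypothetical and not taken from the data; [ \t]+ matches the spaces or tabs in front of the word.

```python
import re

# Python's re module is case-sensitive unless re.IGNORECASE is passed,
# so only the upper-case word MISSING triggers the cut.
MISSING_TAIL = re.compile(r"[ \t]+MISSING.+$")

lines = [
    "PCSG00748 Jikkyou Powerful Pro Yakyuu 2016 MISSING KO5ifR1dQ+d7",
    "PCSG00000 The Missing Piece",  # hypothetical entry, not from the data
]

cleaned = [MISSING_TAIL.sub("", line) for line in lines]
for line in cleaned:
    print(line)
```

The first line loses its MISSING tail, while the mixed-case title is kept whole.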
This removes everything from ( onward, or everything from http onward.
Before:
PCSH10160 Attack of the Toy Tanks (3.61+!) [3.69] http://zeu
PCSH10162 Paradox Soul http://zeus.dl.playstation.net/cdn
PCSH10146 Hoggy2 http://zeus.dl.playstation.net/cdn/HP2005/
PCSB01394 Mekabolt http://zeus.dl.playstation.net/cdn/EP0
PCSH10186 Himno http://zeus.dl.playstation.net/cdn/HP2
PCSG01285 MELLKISS http://zeus.dl.playstation.net/cdn/JP0
PCSB01365 Habroxia http://zeus.dl.playstation.net/cdn/EP5
PCSE01423 Color Slayer http://zeus.dl.playstation.net/cdn
PCSE01396 Habroxia http://zeus.dl.playstation.net/cdn/UP4
PCSG01127 Sen no Hatou, Tsukisome no Kouki http://zeus.dl
PCSB01396 Tic-Tac-Letters by POWGI http://zeus.dl.playsta
PCSH10203 Gravity Duck http://zeus.dl.playstation.net
PCSH10175 Crossovers by POWGI http://zeus.dl.playstation
PCSH10169 Mixups by POWGI (3.61+!) [3.69] http://zeus.dl
PCSH10167 One Word by POWGI http://zeus.dl.playstation
PCSH10166 Word Search by POWGI http://zeus.dl.playsta
PCSH10179 Word Wheel by POWGI http://zeus.dl.playstation
PCSH10180 Wordsweeper by POWGI http://zeus.dl.playsta
PCSH10168 Word Sudoku by POWGI http://zeus.dl.playsta
PCSB00625 SENRAN KAGURA: Bon Appétit! Stacked Soundtrack ht
After:
PCSH10160 Attack of the Toy Tanks
PCSH10162 Paradox Soul
PCSH10146 Hoggy2
PCSB01394 Mekabolt
PCSH10186 Himno
PCSG01285 MELLKISS
PCSB01365 Habroxia
PCSE01423 Color Slayer
PCSE01396 Habroxia
PCSG01127 Sen no Hatou, Tsukisome no Kouki
PCSB01396 Tic-Tac-Letters by POWGI
PCSH10203 Gravity Duck
PCSH10175 Crossovers by POWGI
PCSH10169 Mixups by POWGI
PCSH10167 One Word by POWGI
PCSH10166 Word Search by POWGI
PCSH10179 Word Wheel by POWGI
PCSH10180 Wordsweeper by POWGI
PCSH10168 Word Sudoku by POWGI
PCSB00625 SENRAN KAGURA: Bon Appétit! Stacked Soundtrack ht
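Note:
\(.*?$|http.*?$|MISSING.*?$
Following the conversation in the comments, the fastest regex is
\h+(?:\(|http|MISSING).+$
It improves performance (thanks to @IsmaelMiguel) and covers the new requirement.
Find what: \h+(?:\(|http|MISSING).+$
Replace with: LEAVE EMPTY
. matches newline: unchecked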
Explanation:
\h+ # 1 or more horizontal spaces
(?: # non capture group
\( # opening parenthesis
| # OR
http # literally
| # OR
MISSING # literally
) # end group
.+ # 1 or more any character but newline
$ # end of line
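For anyone who would rather script the cleanup, the breakdown above can be mirrored in Python with re.VERBOSE. This is only a sketch under one assumption: Python's re module does not support \h, so [ \t] is used in its place (it covers spaces and tabs only). The sample lines are taken from the question.

```python
import re

# Same structure as the explanation, written as a verbose pattern.
# Assumption: [ \t] replaces \h, which Python's re module lacks.
TAIL = re.compile(
    r"""
    [ \t]+        # 1 or more spaces or tabs
    (?:           # non-capture group
        \(        #   opening parenthesis
      | http      #   literally
      | MISSING   #   literally
    )             # end group
    .+            # 1 or more of any character but newline
    $             # end of line, thanks to re.MULTILINE
    """,
    re.VERBOSE | re.MULTILINE,
)

text = "\n".join([
    "PCSH10160 Attack of the Toy Tanks (3.61+!) [3.69] http://zeu",
    "PCSH10162 Paradox Soul http://zeus.dl.playstation.net/cdn",
    "PCSG00753 Eikoku Tantei Mysteria: The Crown MISSING KO5ifR1dQ+d7",
    "PCSG00756 I am Setsuna",
])

result = TAIL.sub("", text)
print(result)
```

Lines without any of the three markers, such as the last one, pass through unchanged.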
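Since the stated goal is a JSON file, the same expression could also feed a small conversion step. The sketch below is an assumption about what that might look like: the key names "id" and "title" and the helper to_record are hypothetical, and [ \t] again stands in for \h, which Python's re module does not support.

```python
import json
import re

TAIL = re.compile(r"[ \t]+(?:\(|http|MISSING).+$")

def to_record(line):
    """Strip the unwanted tail, then split the ID from the title."""
    cleaned = TAIL.sub("", line).strip()
    # Split on the first run of whitespace, so uneven spacing
    # between ID and title does not matter.
    parts = cleaned.split(None, 1)
    title = parts[1] if len(parts) > 1 else ""
    # "id" and "title" are hypothetical key names, not from the question.
    return {"id": parts[0], "title": title}

sample = [
    "PCSH10169 Mixups by POWGI (3.61+!) [3.69] http://zeus.dl",
    "PCSG01127 Sen no Hatou, Tsukisome no Kouki http://zeus.dl",
]

# Blank lines are skipped so that parts[0] always exists.
records = [to_record(line) for line in sample if line.strip()]
print(json.dumps(records, indent=2, ensure_ascii=False))
```

Passing ensure_ascii=False keeps titles such as "Bon Appétit!" readable in the output instead of escaping the accented character.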