How do I delete everything from a parenthesis or a specific word to the end of the line in Notepad++?

kle*_*906 7 notepad++ regex

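I am trying to slim down a massive database so I can use the relevant information in a JSON file. It has some very long lines (about 400 characters each) and a couple of thousand entries. Depending on the line, I need to omit everything from ( onward, everything from http onward, or everything from MISSING onward.

Most lines do not contain the ()[] information, but all of them contain the http information. On lines that have it, the http information always follows the () information.

Here is an example. I have truncated the lines for obvious reasons.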

PCSH10160    Attack of the Toy Tanks (3.61+!) [3.69]    http://zeu
PCSH10162    Paradox Soul    http://zeus.dl.playstation.net/cdn
PCSH10146    Hoggy2    http://zeus.dl.playstation.net/cdn/HP2005/
PCSB01394    Mekabolt    http://zeus.dl.playstation.net/cdn/EP0
PCSH10186        Himno    http://zeus.dl.playstation.net/cdn/HP2
PCSG01285    MELLKISS    http://zeus.dl.playstation.net/cdn/JP0
PCSB01365    Habroxia    http://zeus.dl.playstation.net/cdn/EP5
PCSE01423    Color Slayer    http://zeus.dl.playstation.net/cdn
PCSE01396    Habroxia    http://zeus.dl.playstation.net/cdn/UP4
PCSG01127    Sen no Hatou, Tsukisome no Kouki    http://zeus.dl
PCSB01396    Tic-Tac-Letters by POWGI    http://zeus.dl.playsta
PCSH10203        Gravity Duck    http://zeus.dl.playstation.net
PCSH10175        Crossovers by POWGI    http://zeus.dl.playstation
PCSH10169        Mixups by POWGI (3.61+!) [3.69]    http://zeus.dl
PCSH10167        One Word by POWGI    http://zeus.dl.playstation
PCSH10166        Word Search by POWGI    http://zeus.dl.playsta
PCSH10179        Word Wheel by POWGI    http://zeus.dl.playstation
PCSH10180        Wordsweeper by POWGI    http://zeus.dl.playsta
PCSH10168        Word Sudoku by POWGI    http://zeus.dl.playsta
PCSB00625    SENRAN KAGURA: Bon Appétit! Stacked Soundtrack    ht

The final result should be

PCSH10160    Attack of the Toy Tanks
PCSH10162    Paradox Soul
PCSH10146    Hoggy2
PCSB01394    Mekabolt
PCSH10186        Himno
PCSG01285    MELLKISS
PCSB01365    Habroxia
PCSE01423    Color Slayer
PCSE01396    Habroxia
PCSG01127    Sen no Hatou, Tsukisome no Kouki
PCSB01396    Tic-Tac-Letters by POWGI
PCSH10203        Gravity Duck
PCSH10175        Crossovers by POWGI
PCSH10169        Mixups by POWGI
PCSH10167        One Word by POWGI
PCSH10166        Word Search by POWGI
PCSH10179        Word Wheel by POWGI
PCSH10180        Wordsweeper by POWGI
PCSH10168        Word Sudoku by POWGI
PCSB00625    SENRAN KAGURA: Bon Appétit! Stacked Soundtrack

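I am not concerned about the spacing between the ID and the title, since that can be fixed manually.

Oops, I misspoke. After running the provided expression, I noticed a small number of lines that contain the word MISSING followed by various information. Is there a way to include it in the expression alongside ( and http?

A separate expression would also be fine. It only needs to be case-sensitive, because I am worried that the word "missing" appears somewhere in a title and the line would be cut off at that point.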

PCSG00742    Kiss Ato
PCSG00744    One Piece: Burning Blood - Gold Edition
PCSG00747    Zero Escape: Zero Time Dilemma
PCSG00748    Jikkyou Powerful Pro Yakyuu 2016    MISSING    KO5ifR1dQ+d7
PCSG00750    Kai-ri-Sei Million Arthur
PCSG00751    Arcana Famiglia -La Storia Della Arcana Famiglia- Ancora
PCSG00752    Touhou Soujinengi V
PCSG00753    Eikoku Tantei Mysteria: The Crown    MISSING    KO5ifR1dQ+d7
PCSG00756    I am Setsuna

Dav*_*ill 6

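I need to omit everything from ( and beyond, or everything from http and beyond

  • Menu "Search" > "Replace" (or Ctrl+H)

  • Set "Find what" to \(.*?$|http.*?$

  • Leave "Replace with" empty

  • Enable "Regular expression"

  • Click "Replace All"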

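For anyone who would rather script this than use the Replace dialog, the same pattern can be sketched in Python. This is only an illustration, not part of the Notepad++ procedure: the function name strip_tail is made up here, and the sample lines assume four spaces as the column separator. re.MULTILINE is needed so that $ matches at the end of every line.

```python
import re

# Same pattern as in "Find what"; re.MULTILINE lets $ match at each line end.
PATTERN = re.compile(r"\(.*?$|http.*?$", re.MULTILINE)


def strip_tail(text: str) -> str:
    """Delete everything from the first '(' or 'http' to the end of each line."""
    return PATTERN.sub("", text)


# Two truncated sample lines from the question (separator assumed to be spaces).
sample = (
    "PCSH10160    Attack of the Toy Tanks (3.61+!) [3.69]    http://zeu\n"
    "PCSH10162    Paradox Soul    http://zeus.dl.playstation.net/cdn\n"
)

cleaned = strip_tail(sample)
print(cleaned)
```

As in Notepad++, the whitespace in front of the removed part stays on the line, so the cleaned lines keep trailing spaces.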

Before

PCSH10160   Attack of the Toy Tanks (3.61+!) [3.69] http://zeu
PCSH10162   Paradox Soul    http://zeus.dl.playstation.net/cdn
PCSH10146   Hoggy2  http://zeus.dl.playstation.net/cdn/HP2005/
PCSB01394   Mekabolt    http://zeus.dl.playstation.net/cdn/EP0
PCSH10186       Himno   http://zeus.dl.playstation.net/cdn/HP2
PCSG01285   MELLKISS    http://zeus.dl.playstation.net/cdn/JP0
PCSB01365   Habroxia    http://zeus.dl.playstation.net/cdn/EP5
PCSE01423   Color Slayer    http://zeus.dl.playstation.net/cdn
PCSE01396   Habroxia    http://zeus.dl.playstation.net/cdn/UP4
PCSG01127   Sen no Hatou, Tsukisome no Kouki    http://zeus.dl
PCSB01396   Tic-Tac-Letters by POWGI    http://zeus.dl.playsta
PCSH10203       Gravity Duck    http://zeus.dl.playstation.net
PCSH10175       Crossovers by POWGI http://zeus.dl.playstation
PCSH10169       Mixups by POWGI (3.61+!) [3.69] http://zeus.dl
PCSH10167       One Word by POWGI   http://zeus.dl.playstation
PCSH10166       Word Search by POWGI    http://zeus.dl.playsta
PCSH10179       Word Wheel by POWGI http://zeus.dl.playstation
PCSH10180       Wordsweeper by POWGI    http://zeus.dl.playsta
PCSH10168       Word Sudoku by POWGI    http://zeus.dl.playsta
PCSB00625   SENRAN KAGURA: Bon Appétit! Stacked Soundtrack  ht

After

PCSH10160   Attack of the Toy Tanks 
PCSH10162   Paradox Soul    
PCSH10146   Hoggy2  
PCSB01394   Mekabolt    
PCSH10186       Himno   
PCSG01285   MELLKISS    
PCSB01365   Habroxia    
PCSE01423   Color Slayer    
PCSE01396   Habroxia    
PCSG01127   Sen no Hatou, Tsukisome no Kouki    
PCSB01396   Tic-Tac-Letters by POWGI    
PCSH10203       Gravity Duck    
PCSH10175       Crossovers by POWGI 
PCSH10169       Mixups by POWGI 
PCSH10167       One Word by POWGI   
PCSH10166       Word Search by POWGI    
PCSH10179       Word Wheel by POWGI 
PCSH10180       Wordsweeper by POWGI    
PCSH10168       Word Sudoku by POWGI    
PCSB00625   SENRAN KAGURA: Bon Appétit! Stacked Soundtrack  ht

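Notes:

  • The last example line is not cleaned up here because its http is cut off to ht, but it will be handled correctly when you run the replacement on the untruncated file.
  • To truncate lines containing MISSING as well, change "Find what" to \(.*?$|http.*?$|MISSING.*?$

Following the conversation in the comments, the fastest regex is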

  • \h+(?:\(|http|MISSING).+$
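A rough Python sketch of this faster pattern, for checking it outside Notepad++. One assumption to flag: Python's re module does not support \h, so [ \t]+ is used here as a stand-in for horizontal whitespace. The name OPTIMIZED is illustrative only, and the sample lines are taken from the question.

```python
import re

# [ \t]+ stands in for \h+, which Python's re module does not support.
OPTIMIZED = re.compile(r"[ \t]+(?:\(|http|MISSING).+$", re.MULTILINE)

lines = [
    "PCSH10169        Mixups by POWGI (3.61+!) [3.69]    http://zeus.dl",
    "PCSG00748    Jikkyou Powerful Pro Yakyuu 2016    MISSING    KO5ifR1dQ+d7",
    "PCSG00756    I am Setsuna",
]

# Lines without "(", "http" or "MISSING" are left untouched.
cleaned = [OPTIMIZED.sub("", line) for line in lines]

for line in cleaned:
    print(repr(line))
```

Because the leading whitespace is part of the match, this version also removes the spaces in front of the cut, so no trailing whitespace is left behind.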



Tot*_*oto 5

Improved for performance (thanks to @IsmaelMiguel) and updated to answer the new requirement.


  • Ctrl+H
  • Find what: \h+(?:\(|http|MISSING).+$
  • Replace with: LEAVE EMPTY
  • CHECK Match case
  • CHECK Wrap around
  • CHECK Regular expression
  • UNCHECK . matches newline
  • Replace all

Explanation:

\h+             # 1 or more horizontal spaces
(?:             # non capture group
    \(              # opening parenthesis
  |               # OR
    http            # literally
  |               # OR
    MISSING         # literally
)               # end group
.+              # 1 or more any character but newline
$               # end of line
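The breakdown above maps fairly directly onto Python's re.VERBOSE mode, which also makes it easy to see why "Match case" matters. This is a sketch under assumptions: [ \t]+ replaces the horizontal-space class that Python lacks, and the line with the ID PCSX00000, the title "The Missing Piece" and the example.invalid URL are invented for illustration and do not come from the data.

```python
import re

VERBOSE_PATTERN = re.compile(
    r"""
    [ \t]+          # 1 or more horizontal spaces (stand-in for the Notepad++ class)
    (?:             # non-capturing group
        \(          #   opening parenthesis
      | http        #   the literal text http
      | MISSING     #   the literal text MISSING, uppercase only
    )
    .+              # 1 or more characters, excluding newline
    $               # end of line
    """,
    re.VERBOSE | re.MULTILINE,
)

# Same pattern with case-insensitive matching, for contrast only.
INSENSITIVE = re.compile(
    VERBOSE_PATTERN.pattern, re.VERBOSE | re.MULTILINE | re.IGNORECASE
)

# Hypothetical line whose title contains the word "Missing" in mixed case.
title_line = "PCSX00000    The Missing Piece    http://example.invalid/x"

case_sensitive = VERBOSE_PATTERN.sub("", title_line)
case_insensitive = INSENSITIVE.sub("", title_line)

print(repr(case_sensitive))
print(repr(case_insensitive))
```

With case-sensitive matching the title survives and only the URL is removed. With case ignored, the line is cut at "Missing", which is exactly the failure the question was worried about.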
