use*_*326 3 python string bash shell list
I have a text file, lists.txt, that looks like this:
HI family what are u doing ?
HI Family
what are
Channel 5 is very cheap
Channel 5 is
Channel 5 is very
Pokemon
The best Pokemon is Pikachu
I want to clean it up by removing any line that is entirely contained in another line. That is, I want something like this:
HI family, what are u doing ?
The best Pokemon is Pikachu
Channel 5 is very cheap
I tried counting the strings, comparing them with grep, and sorting the output into results.txt, but that did not work well.
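If I understand your question correctly, you want to take a list of strings and remove from it every string that is a substring of another string in the list.
In pseudocode: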
outer: for string s in l
    for string s2 in l
        if s != s2 and s substringOf s2
            continue outer
    print s
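That is, for each string, loop over the whole list once, and skip the current iteration of the outer loop as soon as any test in the inner loop matches. A string must not be compared against itself, because every string is trivially a substring of itself.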
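Since the question is also tagged python, the loop described above could be sketched in Python roughly as follows. This is only an illustration, not part of the original answer: the function name drop_contained is made up for this example, and the for/else construct stands in for the pseudocode's "continue outer".

```python
def drop_contained(strings):
    """Return the strings that are not a substring of any other, different string."""
    kept = []
    for s in strings:
        for s2 in strings:
            # Skip identical strings; otherwise every string would match itself.
            if s != s2 and s in s2:
                break  # s is contained in s2, so drop it
        else:
            # The inner loop finished without a match, so s survives.
            kept.append(s)
    return kept


# The cleaned-up sample data used in this answer.
lines = [
    "HI family what are u doin?",
    "HI family what are",
    "Channel 5 is very cheap",
    "Channel 5 is",
    "Channel 5 is very",
    "Pokemon",
    "The best Pokemon is Pikachu",
]
for line in drop_contained(lines):
    print(line)
```

Like the pseudocode, this compares every string with every other one, so the number of comparisons grows quadratically with the number of lines. Exact duplicates are all kept, because identical strings are never compared.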
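Here is an implementation of that algorithm in bash. Note that the file (list.txt) is read twice in the code via the < redirection operator, once for the outer loop and once for the inner loop.
(I also cleaned up your example, which had a lot of typos.)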
$ cat list.txt
HI family what are u doin?
HI family what are
Channel 5 is very cheap
Channel 5 is
Channel 5 is very
Pokemon
The best Pokemon is Pikachu
$ while IFS= read -r line; do while IFS= read -r line2; do [[ $line2 != "$line" ]] && [[ $line2 == *"$line"* ]] && continue 2; done <list.txt; echo "$line"; done <list.txt
HI family what are u doin?
Channel 5 is very cheap
The best Pokemon is Pikachu
$
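For comparison, a Python version that works on a file might read the file once into memory instead of re-reading it for the inner loop. The sketch below assumes the file fits in memory; filter_file is a hypothetical name, and the temporary file only stands in for list.txt so that the example runs on its own.

```python
import os
import tempfile


def filter_file(path):
    """Read path once and return the lines not contained in any other, different line."""
    with open(path, encoding="utf-8") as handle:
        lines = [line.rstrip("\n") for line in handle]
    return [
        line
        for line in lines
        # Drop the line if some other, non-identical line contains it.
        if not any(other != line and line in other for other in lines)
    ]


# Demo on a temporary copy of the cleaned-up sample data (stand-in for list.txt).
sample = (
    "HI family what are u doin?\n"
    "HI family what are\n"
    "Channel 5 is very cheap\n"
    "Channel 5 is\n"
    "Channel 5 is very\n"
    "Pokemon\n"
    "The best Pokemon is Pikachu\n"
)
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)
try:
    result = filter_file(tmp.name)
finally:
    os.remove(tmp.name)  # clean up the temporary file
print("\n".join(result))
```

Python's in operator does a plain substring test, so characters such as ? or * in a line are treated literally.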